Troubleshooting¶
This guide collects fixes for common issues with the Firefly development environment, deployment, local development, authentication, service integration, and testing.
Development Environment Issues¶
Node Version Requirements¶
Error: Build failures or compatibility issues
Fix: Ensure Node.js version >= 20.15.0
nvm install 20
nvm use 20
node --version # Should show v20.15.0 or higher
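The version check can also be scripted so that setup fails fast on an old runtime. The following is a minimal sketch, not part of the Firefly tooling: the helper name version_at_least is hypothetical, and it assumes plain MAJOR.MINOR.PATCH version strings with no pre-release suffix.

```shell
# Succeeds (exit 0) when version $1 is at least version $2.
# Both arguments are MAJOR.MINOR.PATCH strings; a leading "v" is ignored.
# Hypothetical helper, written for POSIX sh.
version_at_least() {
  have="${1#v}"
  want="${2#v}"
  # Split each version on "." into its three numeric parts.
  IFS=. read -r h_major h_minor h_patch <<EOF
$have
EOF
  IFS=. read -r w_major w_minor w_patch <<EOF
$want
EOF
  # Treat missing parts as 0, so "20" compares like "20.0.0".
  h_minor="${h_minor:-0}"; h_patch="${h_patch:-0}"
  w_minor="${w_minor:-0}"; w_patch="${w_patch:-0}"
  if [ "$h_major" -ne "$w_major" ]; then
    [ "$h_major" -gt "$w_major" ]
    return
  fi
  if [ "$h_minor" -ne "$w_minor" ]; then
    [ "$h_minor" -gt "$w_minor" ]
    return
  fi
  [ "$h_patch" -ge "$w_patch" ]
}

# Example call (requires node on the PATH):
#   version_at_least "$(node --version)" 20.15.0 || echo "Node.js >= 20.15.0 is required"
```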
Firefly Alias Not Found¶
Error: firefly: command not found
Fix: Make sure you added the firefly alias to your shell profile and reloaded it
# Reload your shell configuration
source ~/.zshrc # or ~/.bashrc
# Then run the setup
firefly
Authentication Expired or CodeArtifact Issues¶
Error: Authentication errors or CodeArtifact login failures
Fix: Simply run the firefly alias again
firefly
Commitizen Not Working¶
Error: git cz command not found or not working
Fix: Reinstall globally
npm install -g commitizen cz-conventional-changelog
echo '{ "path": "cz-conventional-changelog" }' > ~/.czrc
Deployment Issues¶
MetalFly API Deployment Required¶
Error: deploy:Core:dev [region] error when submitting MR
Fix: Deploy MetalFly API to all regions before submitting changes
cd metalfly_api/
yarn dev-deploy:all
Uncommitted Changes Error¶
Error: Deployment fails due to uncommitted changes
Fix: Commit all local changes before developer stack deployment, including:
Packed .tgz files
package.json updates
yarn.lock file changes
git add .
git commit -m "feat: local development changes"
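Before deploying, it can help to confirm that nothing is left uncommitted. The following is a minimal sketch built on git status --porcelain, which prints one line per changed or untracked file and nothing for a clean tree. The helper name worktree_is_clean is hypothetical and not part of the Firefly tooling.

```shell
# Succeeds (exit 0) when the Git working tree in directory $1 (default: the
# current directory) has no staged, unstaged, or untracked changes.
# Returns 1 when changes exist and 2 when git status fails, for example
# because the directory is not a Git repository.
worktree_is_clean() {
  changes="$(git -C "${1:-.}" status --porcelain 2>/dev/null)" || return 2
  [ -z "$changes" ]
}

# Example call from the project root:
#   worktree_is_clean || echo "Commit .tgz, package.json, and yarn.lock changes first"
```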
CDK Installation Issues¶
Error: CDK-related deployment failures
Fix: Ensure yarn install is run in both root and cdk directories
# In project root
yarn install --latest
# In cdk directory
cd cdk
yarn install --latest
Also verify AWS_PROFILE consistency between local environment and cdk/package.json.
Local Development Issues¶
Docker Compose Issues¶
Error: Command "nodemon" not found when running docker compose up in MetalFly
Fix: Run yarn install first
yarn install
docker compose up
Docker Compose Command Not Valid¶
Error: docker compose up or docker-compose up shows “not a valid command”
Fix: If you are running docker compose (with the space), double-check your Docker CLI installation. If you are running docker-compose (with the hyphen), try reinstalling it:
brew install docker-compose
docker-compose up
JavaScript Heap Out of Memory (Local Development)¶
Error: “JavaScript heap out of memory” when spinning up MetalFly Core
Fix for Docker Users:
colima start --memory 8
Fix for Podman Users:
Go to Podman Desktop settings and raise the machine memory to at least 8 GB
You may also need to run:
podman machine set --memory 8192
Fresh Start for Local Development¶
Error: Need to clean restart local development environment
Fix using Yarn:
yarn clean:docker
Fix using Docker Compose:
docker-compose down --volumes --remove-orphans && docker network prune
Service Connectivity Issues¶
Error: Cannot connect to services in local development
Fix:
Ensure you’re connected to Cisco VPN in your target region
Verify both MetalFly API and Core containers are running
Check that credentials are properly refreshed for both services
AWS Credentials in Containers¶
Error: “Could not load credentials from any providers” when making AWS service calls from MetalFly Core
Fix: The docker compose up command should include credentials automatically. To force inclusion, run:
docker compose -f docker-compose.override.yml -f docker-compose.yml up
If the problem persists, check the Docker logs of the credentials container:
docker compose logs ecs-local-endpoints
Memory Issues¶
Build Memory Issues¶
Error:
Bundling with Webpack...
(node:28613) [DEP_WEBPACK_EXTERNALS_FUNCTION_PARAMETERS] DeprecationWarning: The externals-function should be defined like ({context, request}, cb) => { ... }
⠼ Packaging (73s)
<--- Last few GCs --->
[28613:0x7fe54284d000] 81151 ms: Scavenge 4041.9 (4127.2) -> 4037.1 (4127.2) MB, 4.6 / 0.0 ms (average mu = 0.801, current mu = 0.729) allocation failure;
[28613:0x7fe54284d000] 81227 ms: Scavenge 4044.2 (4127.2) -> 4038.7 (4128.0) MB, 8.2 / 0.0 ms (average mu = 0.801, current mu = 0.729) allocation failure;
[28613:0x7fe54284d000] 81308 ms: Scavenge 4045.5 (4128.2) -> 4039.8 (4145.7) MB, 26.6 / 0.0 ms (average mu = 0.801, current mu = 0.729) allocation failure;
<--- JS stacktrace --->
FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
Fix:
export NODE_OPTIONS="--max-old-space-size=8192"
Note: The firefly alias automatically sets NODE_OPTIONS=--max_old_space_size=8172 to prevent memory issues.
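A plain export replaces any options already stored in NODE_OPTIONS. The following is a minimal sketch that appends the heap flag only when no heap limit is set yet. The helper name with_heap_limit is hypothetical. Node.js accepts both the hyphenated and the underscore spelling of this V8 option, so the sketch looks for both.

```shell
# Prints the NODE_OPTIONS value $1 with a heap limit of $2 MB appended.
# If $1 already contains a max-old-space-size flag (either spelling),
# it is printed unchanged so an existing limit is never overridden.
with_heap_limit() {
  case "$1" in
    *--max-old-space-size=*|*--max_old_space_size=*)
      printf '%s\n' "$1" ;;
    "")
      printf '%s\n' "--max-old-space-size=$2" ;;
    *)
      printf '%s\n' "$1 --max-old-space-size=$2" ;;
  esac
}

# Example call:
#   export NODE_OPTIONS="$(with_heap_limit "$NODE_OPTIONS" 8192)"
```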
Authentication and API Issues¶
Common Error Codes¶
ACCESS_DENIED (403)
Cause: A missing authorization token or authorization header in the request to FireFly, an invalid API key, or a data provider service using CloudAuth that is set up incorrectly.
Fix: Check your authorization token and API key. Ensure they are properly formatted and not expired.
INTERNAL (500)
Cause: Default catch-all when the error doesn’t fit in any specified category.
Fix: Check the error details. If not available, view the FireFly logs.
THROTTLED (429)
Cause: API key rate limit exceeded.
Fix: If testing, make fewer requests. If in production, contact the FireFly team to raise your rate limit using this ticket template.
SERVICE_TIMEOUT (408)
Cause: FireFly has reserved 3 seconds to call the upstream data provider service. If the service takes longer than 3 seconds, FireFly abandons the call.
Fix: Retry or investigate if the issue is with the data provider service.
AUTH_INVALID_CLIENT_ID (401), AUTH_INVALID_CUSTOMER_ID (401), AUTH_INVALID_DEVICE_ID (401)
Cause: Incorrect token or incorrect API key.
Fix: Check the Client Onboarding documentation, ensure the token is formatted correctly, and regenerate it. Tokens are valid for at most 1 hour.
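For SERVICE_TIMEOUT (408) the fix above is to retry, and for THROTTLED (429) it is to make fewer requests. In a test script, both can be approximated with a small retry loop that waits longer after each attempt. This is a sketch only: retry_request, is_retryable_status, RETRY_MAX_ATTEMPTS, and RETRY_BASE_DELAY are hypothetical names, and it assumes the codes listed above are visible as HTTP status codes. Retrying a 429 after a delay is an assumption, not documented FireFly behavior.

```shell
# Succeeds for the status codes this sketch treats as worth retrying.
is_retryable_status() {
  case "$1" in
    408|429) return 0 ;;
    *) return 1 ;;
  esac
}

# Runs the command given as arguments, which must print an HTTP status code
# on stdout. Retries while the code is retryable, doubling the delay each
# time, for at most RETRY_MAX_ATTEMPTS attempts (default 3).
# Prints the status code of the last attempt.
retry_request() {
  attempt=1
  delay="${RETRY_BASE_DELAY:-1}"
  while :; do
    code="$("$@")"
    if ! is_retryable_status "$code" || [ "$attempt" -ge "${RETRY_MAX_ATTEMPTS:-3}" ]; then
      printf '%s\n' "$code"
      return 0
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Example call, with FIREFLY_URL as a placeholder for your endpoint:
#   retry_request curl -s -o /dev/null -w '%{http_code}' "$FIREFLY_URL"
```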
Authentication Failed in GraphQL Console¶
Error: “Authentication failed” in GraphQL Console
Fix: Make sure you’re signed in to music.amazon.com with the same account, then refresh your token.
Query Failed or Timeout¶
Error: “Query failed” or timeout errors
Fix: Check query syntax (explorer highlights errors) and try simpler queries first.
Insomnia Setup Issues¶
Unable to Resolve deviceId¶
Error: “Unable to resolve deviceId. Check request and/or headers”
Fix:
Check the “Auth” tab of the request
Ensure authentication type is set to “OAuth 2.0” (not “None” or “Inherit from parent”)
Verify all populated fields in the OAuth 2.0 form are correct
Click “Clear OAuth 2 Session”
Click “Fetch Tokens” and follow the popup login flow
OAuth2 Failed to Refresh Token¶
Error: [oauth2] Failed to refresh token url=https://api.amazon.com/auth/o2/token status=400
Fix: This is a known bug in Insomnia. Simply click “Fetch Tokens” again to go through the login flow once more.
Client ID Unset Error¶
Error: “We’re sorry! An error occurred when we tried to process your request. Rest assured, we’re already working on the problem and expect to resolve it shortly”
Fix: This occurs when the Client ID is unset. Ensure it is updated with _.oAuthClientId and try the request again.
Client Secret Invalid¶
Error: Failed to fetch token url=https://api.amazon.com/auth/o2/token status=401 no description provided
Fix: This occurs when the Client Secret is unset or invalid. Ensure it is updated with _.oAuthClientSecret and try the request again.
Invalid Scope Error¶
Error: OAuth 2.0 Error invalid_scope An unknown scope was requested undefined
Fix: Click on the “Auth” tab and under Advanced Options, set the “Scope” field to profile. Try the request again.
Service Integration Issues¶
CloudAuth Service Integration¶
Error: "message": "getaddrinfo ENOTFOUND mis-q1t-vui-na-p-tcp.iad.amazon.com"
Fix:
Check if VPC/CloudAuth endpoints are defined correctly
Check for missing CloudAuth code changes (within amu_webapi or fireflyservicecdk)
Check for mismatched dev and prod endpoints
Check connection in the FireFly VPC console
Create a lambda in the same VPC and try to hit the service VPC endpoint to check if connection is established
Error: Request failed with status code 404
Fix:
Check if you missed CloudAuth configurations in the code
Check service/targetModel/path/CloudAuth endpoints defined in the service package
Verify service name in constants.ts matches service name in FF_package
Check if service endpoints require a path to be defined in FF_package
Error: Request failed with status code 401
Fix:
Check AAA relationship between FireFly and the service you are integrating
Check if AuthToken is being generated for the API you are calling in the service
Error: Request failed with status code 500
Fix:
Check if input parameters are correct
Refresh the LWA token
Yarn Package Management Issues¶
Yarn Pack Not Reflecting Changes¶
Error: Deployment not reflecting changes after yarn pack
Fix:
Bump the version number of the client package in package.json under version
Run yarn cache clean
Increment the dependency’s version number so that the yarn cache recognizes the newly built changes in the .tgz file
Yarn Install Errors with Packed Files¶
Error: Error when running yarn install with packed files
Fix:
Check for spaces after file: in package.json and remove them
Run yarn cache clean
Increment the dependency’s version number
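The check for stray spaces after file: can be automated. The following is a minimal sketch, assuming each dependency sits on its own line as in a conventionally formatted package.json. The helper name find_file_spec_spaces is hypothetical.

```shell
# Prints every line of file $1 (prefixed with its line number) where a quoted
# dependency value starts with "file:" followed by whitespace.
# Like grep, succeeds (exit 0) only when at least one such line is found.
find_file_spec_spaces() {
  grep -nE '"file:[[:space:]]+' "$1"
}

# Example call from the package directory:
#   find_file_spec_spaces package.json && echo "Remove the spaces after file:"
```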
Testing Issues¶
Integration Test Failures¶
Error: Integration tests failing against deployed endpoints
Fix:
Ensure you have created the endpoint.env file in the root directory with your endpoint values:
NA_ENDPOINT=https://REPLACE-WITH-YOUR-NA-ENDPOINT.execute-api.us-east-1.amazonaws.com/dev/
EU_ENDPOINT=https://REPLACE-WITH-YOUR-EU-ENDPOINT.execute-api.eu-west-1.amazonaws.com/dev/
FE_ENDPOINT=https://REPLACE-WITH-YOUR-FE-ENDPOINT.execute-api.us-west-2.amazonaws.com/dev/
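A quick pre-flight check can catch an endpoint.env file that is missing or still holds the template placeholders. The following is a minimal sketch: the helper name endpoint_env_is_ready is hypothetical, and it assumes the three variable names shown above, each set to an https:// URL.

```shell
# Succeeds (exit 0) when file $1 exists, defines NA_ENDPOINT, EU_ENDPOINT, and
# FE_ENDPOINT as https:// URLs, and contains no "REPLACE-WITH-YOUR" placeholder.
endpoint_env_is_ready() {
  [ -f "$1" ] || return 1
  if grep -q 'REPLACE-WITH-YOUR' "$1"; then
    return 1
  fi
  for name in NA_ENDPOINT EU_ENDPOINT FE_ENDPOINT; do
    grep -q "^${name}=https://" "$1" || return 1
  done
  return 0
}

# Example call from the project root:
#   endpoint_env_is_ready endpoint.env || echo "Fill in endpoint.env before running integration tests"
```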
Lambda Role Access Issues¶
Error: Token returns null for new client integration
Fix: Have the Devex API Team on-call add your Lambda role to the AAA page for MusicFireFlyService.
Cache Debugging Issues¶
Cache Entries Not Found¶
Error: Cannot find expected cache entries
Fix:
Check other endpoints in the Elasticache cluster, as cache entries may exist in any of them
Try using different wildcards (*, customerId, API) for the search
Check the TTL for the entry, since it may be expiring quickly
Make sure you’re using the Elasticache endpoint for the correct region
Verify your code is being hit and not bypassed due to pre-existing cache values
Getting Help¶
Support Channels¶
Slack: #music-firefly-interest
Sage: Firefly Q&A
Office Hours: Schedule time with the team
Additional Resources¶
Main Sage Post on Local Development: FireFly Local Development
Memory Issue Sage Post: Memory Issues
Error Codes Reference: FFError.ts
Status Codes Reference: StatusCodes.ts
