This guide helps you diagnose and resolve common issues with Skyhook CI/CD workflows.
Build Failures
Docker Build Fails
Symptoms:
- Build step fails in GitHub Actions
- Error messages about Dockerfile syntax or missing files
- Build context errors
Common causes and solutions:
- Invalid Dockerfile syntax
  Check your Dockerfile for:
  - Correct instruction format (FROM, RUN, COPY, etc.)
  - Proper line continuation with backslashes
  - Valid base image references
- Missing files or directories
  Ensure files referenced in COPY or ADD commands exist:
  - Check file paths are relative to build context
  - Verify .dockerignore isn't excluding required files
  - Confirm files are committed to Git
- Base image not found
  Verify base image exists and is accessible:
  - Check image name and tag are correct
  - Ensure you have access to private registries
  - Try pulling the image locally first
- Build context too large
  Reduce build context size:
  - Add node_modules, .git, etc. to .dockerignore
  - Remove unnecessary files from repository
  - Use multi-stage builds to minimize final image size (this shrinks the image, not the context sent to the builder)
Debugging steps:
- Review the full build logs in GitHub Actions
- Try building locally:
  docker build -t test .
- Check recent changes to Dockerfile or dependencies
- Verify all required files are in the repository
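The "missing files" check above can be partly automated. The following is a minimal sketch, not a Skyhook or Docker feature: check_copy_sources is a hypothetical helper that reads a Dockerfile and reports COPY/ADD sources that do not exist in the build context. It deliberately skips wildcard sources, URLs, and COPY --from=..., only recognizes uppercase instructions, and does not evaluate .dockerignore or line continuations.

```shell
#!/bin/sh
# Hypothetical helper (not part of Skyhook or Docker): list COPY/ADD sources
# that are missing from the build context.
# Note: plain sh has no local variables, so the names below are global.
check_copy_sources() {
  dockerfile=$1
  context=${2:-.}
  missing=0
  while read -r instr rest; do
    case "$instr" in
      COPY|ADD) ;;
      *) continue ;;
    esac
    set -f              # keep wildcards literal while splitting arguments
    set -- $rest
    set +f
    # Every argument except the last one (the destination) is a source.
    while [ "$#" -gt 1 ]; do
      case "$1" in
        --from=*) break ;;                      # copied from another stage or image
        --*|http://*|https://*|*\**|*\?*) ;;    # flags, URLs, wildcards: not checked
        *)
          if [ ! -e "$context/$1" ]; then
            echo "missing: $1"
            missing=1
          fi
          ;;
      esac
      shift
    done
  done < "$dockerfile"
  return "$missing"
}
```

Run it as check_copy_sources Dockerfile . from the repository root. A non-zero exit status means at least one source was not found in a fresh checkout of the context.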
Image Push Fails
Symptoms:
- Build succeeds but push to registry fails
- Authentication errors to container registry
- “Repository does not exist” errors
Common causes and solutions:
- Invalid registry credentials
  For AWS ECR:
  - Verify AWS_DEPLOY_ROLE or AWS credentials are correct
  - Check IAM role has ecr:GetAuthorizationToken permission
  - Ensure role trust policy includes GitHub OIDC provider
  For GCP Artifact Registry:
  - Verify WIF_PROVIDER and WIF_SERVICE_ACCOUNT are correct
  - Check service account has Artifact Registry Writer role
  - Ensure Workload Identity binding is configured
  For Azure ACR:
  - Verify service principal credentials
  - Check service principal has AcrPush role
- Repository doesn’t exist
  AWS ECR: Repositories are created automatically
  - Verify IAM permissions include ecr:CreateRepository
  GCP: Create the repository manually
  - gcloud artifacts repositories create
  Azure ACR: Repositories are created on first push, so there is no create command
  - Confirm the registry itself exists: az acr show --name <registry-name>
- Registry URL is incorrect
  Check registry format in .koala.toml:
  - AWS: 123456789012.dkr.ecr.us-east-1.amazonaws.com (the account ID is 12 digits)
  - GCP: us-docker.pkg.dev/project-id/repository
  - Azure: myregistry.azurecr.io
Debugging steps:
- Verify registry URL in workflow logs
- Test authentication manually with cloud CLI tools
- Check registry permissions in cloud console
- Review GitHub secrets configuration
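To test authentication manually, it helps to know which login command matches a registry URL. The sketch below is an illustration only: registry_login_hint is a hypothetical helper that recognizes the three URL formats listed above and prints the usual cloud CLI login command without running it. It assumes a 12-digit AWS account ID and treats anything else as unrecognized.

```shell
#!/bin/sh
# Hypothetical helper: guess the provider from a registry URL and print the
# usual manual login command. It only prints; nothing is executed.
registry_login_hint() {
  registry=$1
  host=${registry%%/*}          # drop any /project/repository suffix
  if printf '%s\n' "$host" | grep -Eq '^[0-9]{12}\.dkr\.ecr\.[a-z0-9-]+\.amazonaws\.com$'; then
    region=${host#*.dkr.ecr.}
    region=${region%.amazonaws.com}
    echo "aws ecr get-login-password --region $region | docker login --username AWS --password-stdin $host"
  elif printf '%s\n' "$host" | grep -Eq '^[a-z0-9-]+-docker\.pkg\.dev$'; then
    echo "gcloud auth configure-docker $host"
  elif printf '%s\n' "$host" | grep -Eq '^[a-z0-9]+\.azurecr\.io$'; then
    echo "az acr login --name ${host%.azurecr.io}"
  else
    echo "unrecognized registry format: $registry" >&2
    return 1
  fi
}
```

If the helper rejects your URL, compare it character by character with the formats above before looking at credentials.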
Deployment Failures
CI workflow ignores the ref you picked
Applies to services on older or customized workflow files — current generated workflows declare ref already, so if you haven’t modified yours you won’t hit this.
Symptoms:
- Deploying from a branch works, but the workflow runs against main instead of the ref you picked in the Skyhook UI
- Skyhook’s deploy triggers a workflow run with no ref parameter applied
Cause:
Skyhook passes a ref input to your workflow_dispatch call. If your workflow file doesn’t declare that input, GitHub rejects the value silently and falls back to the HEAD of the target branch.
Fix:
The fastest path is to regenerate your workflows with the CLI, which ships the current template (including the ref input):
skyhook update cicd --reset
If you’ve customized the workflow and want to keep your changes, add ref manually:
on:
  workflow_dispatch:
    inputs:
      ref:
        description: 'Git ref (branch, tag, or SHA) to build from'
        required: false
        type: string
jobs:
  build:
    steps:
      - uses: actions/checkout@v6
        with:
          ref: ${{ inputs.ref || github.ref }}
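To find out which workflow files still lack the declaration, a rough text check is usually enough. The sketch below is a heuristic, not a YAML parser: it assumes an input declaration appears as a bare ref: key with its settings nested underneath, while the checkout step's ref: carries a value on the same line. Both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical check: does a workflow file declare a `ref` input?
# Heuristic only: matches an indented "ref:" key that has no inline value.
workflow_declares_ref() {
  grep -Eq '^[[:space:]]+ref:[[:space:]]*(#.*)?$' "$1"
}

# Report every workflow file in a directory that lacks the declaration.
find_workflows_missing_ref() {
  dir=${1:-.github/workflows}
  for f in "$dir"/*.yml "$dir"/*.yaml; do
    [ -f "$f" ] || continue
    workflow_declares_ref "$f" || echo "missing ref input: $f"
  done
}
```

Run find_workflows_missing_ref from the repository root. Any file it lists is a candidate for the manual fix above or for regeneration.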
kubectl Deployment Fails
Symptoms:
- Deployment step fails after successful build
- “connection refused” or “unauthorized” errors
- kubectl commands timeout
Common causes and solutions:
- Invalid cluster credentials
  Verify cluster access:
  - Check cluster name and region are correct
  - Ensure credentials have cluster admin permissions
  - Test connection: kubectl get nodes
- Cluster name format incorrect
  Verify cluster format matches cloud provider:
  AWS: aws/123456789012/us-east-1/my-cluster
  - Account ID (12 digits), region, cluster name must be exact
  GCP: gcp/my-project/us-central1/my-cluster
  - Project ID, location, cluster name must match
  Azure: azure/subscription-id/eastus/my-cluster
  - Subscription ID, region, cluster name must match
- Insufficient permissions
  Ensure service account/role has Kubernetes permissions:
  - AWS: IAM role needs eks:DescribeCluster
  - GCP: Service account needs container.clusters.get
  - Azure: Service principal needs the Azure Kubernetes Service Cluster User Role
- Invalid Kubernetes manifests
  Check manifest syntax:
  - Validate YAML syntax
  - Ensure apiVersion is correct for your cluster
  - Verify resource names follow Kubernetes naming rules
  - Test locally: kubectl apply --dry-run=client -f manifests/
Debugging steps:
- Review kubectl output in workflow logs
- Verify cluster exists and is accessible
- Check Kubernetes manifest syntax
- Test deployment locally with kubectl
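When testing locally, the cluster identifier already contains most of what the cloud CLI needs. The sketch below is an assumption-laden illustration: kubeconfig_hint is a hypothetical helper that splits a provider/scope/location/name identifier in the format shown above and prints the usual command for fetching credentials. The Azure command needs a resource group, which the identifier does not carry, so it is left as a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: split a provider/scope/location/name cluster identifier
# and print the usual command for fetching kubeconfig credentials.
# It only prints the command; nothing is executed.
kubeconfig_hint() {
  IFS=/ read -r provider scope location name extra <<EOF
$1
EOF
  # Exactly four slash-separated parts are expected.
  if [ -z "$name" ] || [ -n "$extra" ]; then
    echo "expected provider/scope/location/name, got: $1" >&2
    return 1
  fi
  case "$provider" in
    aws)   echo "aws eks update-kubeconfig --region $location --name $name" ;;
    gcp)   echo "gcloud container clusters get-credentials $name --location $location --project $scope" ;;
    azure) echo "az aks get-credentials --subscription $scope --resource-group <resource-group> --name $name" ;;
    *)     echo "unknown provider: $provider" >&2; return 1 ;;
  esac
}
```

After running the printed command, kubectl get nodes confirms whether the credentials and the identifier line up.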
ArgoCD Not Syncing
Symptoms:
- Workflow completes but changes don’t appear in cluster
- ArgoCD shows “OutOfSync” status
- Application health degraded
Common causes and solutions:
- ArgoCD not configured correctly
  Verify ArgoCD setup:
  - Check ArgoCD is installed in cluster
  - Ensure Application resource exists
  - Verify repository is connected in ArgoCD
  - Check sync policy is configured
- Repository access issues
  Ensure ArgoCD can access deployment repository:
  - For private repos: Add deploy key or credentials in ArgoCD
  - Verify repository URL is correct
  - Check branch name matches ArgoCD Application spec
- Manifest path incorrect
  Check path configuration:
  - Verify path in ArgoCD Application matches repo structure
  - Ensure kustomization.yaml exists at specified path
  - Check environment overlay path is correct
- Sync policy prevents auto-sync
  Check ArgoCD Application sync settings:
  - Enable automated sync if desired
  - Check for required manual approval
  - Review sync options and prune settings
Debugging steps:
- Check ArgoCD UI for application status
- Review ArgoCD application logs:
  kubectl logs -n argocd <argocd-server-pod>
- Verify Git commits appear in deployment repository
- Manually trigger sync in ArgoCD UI
- Check ArgoCD application events:
  kubectl describe application -n argocd <app-name>
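The same status the ArgoCD UI shows can be read from the Application resource. This is a minimal sketch with hypothetical function names; it assumes the Application lives in the argocd namespace unless you pass another one, and it reads the status.sync.status and status.health.status fields.

```shell
#!/bin/sh
# Hypothetical helpers around kubectl; the namespace defaults to argocd.
argocd_app_status() {
  app=$1
  ns=${2:-argocd}
  sync=$(kubectl get application "$app" -n "$ns" -o jsonpath='{.status.sync.status}')
  health=$(kubectl get application "$app" -n "$ns" -o jsonpath='{.status.health.status}')
  echo "sync=$sync health=$health"
}

# Succeeds only when the Application is both Synced and Healthy.
argocd_app_ok() {
  [ "$(argocd_app_status "$@")" = "sync=Synced health=Healthy" ]
}
```

For example, argocd_app_ok <app-name> || kubectl describe application -n argocd <app-name> prints the events only when something is off.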
Authentication Issues
AWS Authentication Fails
Symptoms:
- “Unable to locate credentials” error
- “Access denied” when accessing EKS or ECR
- OIDC token validation errors
Common causes and solutions:
- OIDC provider not configured
  Ensure OIDC provider exists in AWS IAM:
  - Provider URL: token.actions.githubusercontent.com
  - Audience: sts.amazonaws.com
  - Verify provider is active
- IAM role trust policy incorrect
  Check role trust policy includes GitHub:
  {
    "Effect": "Allow",
    "Principal": {
      "Federated": "arn:aws:iam::ACCOUNT:oidc-provider/token.actions.githubusercontent.com"
    },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "token.actions.githubusercontent.com:sub": "repo:ORG/REPO:ref:refs/heads/main"
      }
    }
  }
- Missing IAM permissions
  Verify role has required policies:
  - EKS access: eks:DescribeCluster, eks:ListClusters
  - ECR access: ecr:GetAuthorizationToken, ecr:BatchCheckLayerAvailability, ecr:PutImage, ecr:InitiateLayerUpload, ecr:UploadLayerPart, ecr:CompleteLayerUpload, ecr:CreateRepository
- AWS_DEPLOY_ROLE variable not set
  Configure in GitHub repository:
  - Go to Settings → Secrets and variables → Actions
  - Add variable: AWS_DEPLOY_ROLE
  - Value: arn:aws:iam::123456789012:role/github-actions-deploy
Debugging steps:
- Verify OIDC provider exists in IAM
- Check role ARN is correct in GitHub variable
- Review IAM role trust policy and permissions
- Test role assumption locally with AWS CLI
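A trust policy that pins the sub claim to refs/heads/main, as in the example above, rejects runs started from any other branch, tag, or environment. The sketch below shows how the claim is formed so you can compare it with your policy. Both functions are hypothetical helpers, and the matcher uses shell glob matching as a stand-in for IAM's StringLike, which is close but not identical.

```shell
#!/bin/sh
# Hypothetical helper: build the OIDC `sub` claim GitHub Actions sends for a
# run, so it can be compared with the trust policy condition.
github_oidc_sub() {
  repo=$1      # ORG/REPO
  kind=$2      # branch | tag | environment | pull_request
  name=$3
  case "$kind" in
    branch)       echo "repo:$repo:ref:refs/heads/$name" ;;
    tag)          echo "repo:$repo:ref:refs/tags/$name" ;;
    environment)  echo "repo:$repo:environment:$name" ;;
    pull_request) echo "repo:$repo:pull_request" ;;
    *)            echo "unknown kind: $kind" >&2; return 1 ;;
  esac
}

# Does a policy value ($1, exact or containing *) match the claim ($2)?
# Approximation: uses shell glob rules, not IAM's own matcher.
sub_allowed() {
  case "$2" in
    $1) return 0 ;;
    *)  return 1 ;;
  esac
}
```

If deploys from feature branches fail with an OIDC error while main works, this comparison usually shows why. Widening the condition (for example to repo:ORG/REPO:*) requires StringLike instead of StringEquals in the policy.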
GCP Authentication Fails
Symptoms:
- “Permission denied” errors
- “Invalid JWT” or token validation errors
- Cannot access GKE or Artifact Registry
Common causes and solutions:
- Workload Identity not configured
  Verify Workload Identity setup:
  - Pool exists: gcloud iam workload-identity-pools describe github-actions
  - Provider configured for GitHub OIDC
  - Attribute mapping includes repository
- Service account permissions missing
  Ensure service account has required roles:
  - GKE access: roles/container.developer
  - Artifact Registry: roles/artifactregistry.writer
  - Basic: roles/iam.serviceAccountTokenCreator
- Workload Identity binding incorrect
  Check service account IAM policy binding:
  gcloud iam service-accounts get-iam-policy \
    SERVICE_ACCOUNT@PROJECT.iam.gserviceaccount.com
  Should include:
  - Role: roles/iam.workloadIdentityUser
  - Member: principalSet://iam.googleapis.com/projects/PROJECT_NUM/locations/global/workloadIdentityPools/github-actions/attribute.repository/ORG/REPO
- WIF variables not set correctly
  Configure in GitHub repository:
  - WIF_PROVIDER: projects/PROJECT_NUM/locations/global/workloadIdentityPools/github-actions/providers/github
  - WIF_SERVICE_ACCOUNT: SERVICE_ACCOUNT@PROJECT.iam.gserviceaccount.com
Debugging steps:
- Verify Workload Identity pool and provider exist
- Check service account has necessary roles
- Review Workload Identity binding for repository
- Test authentication locally with gcloud
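The long Workload Identity strings are easy to mistype, so it can help to assemble them from their parts and check the result. The sketch below uses hypothetical helper names and takes the pool name github-actions and the provider name github from the examples above; substitute your own. The validity check assumes the numeric project number is used, as PROJECT_NUM in the examples suggests.

```shell
#!/bin/sh
# Hypothetical helpers: assemble and sanity-check the Workload Identity values.
# Arguments: project number, then optional pool and provider names.
wif_provider_name() {
  echo "projects/$1/locations/global/workloadIdentityPools/${2:-github-actions}/providers/${3:-github}"
}

# Arguments: project number, ORG/REPO, optional pool name.
wif_principal_set() {
  echo "principalSet://iam.googleapis.com/projects/$1/locations/global/workloadIdentityPools/${3:-github-actions}/attribute.repository/$2"
}

# Rejects values that use a project ID where the project number is expected.
wif_provider_is_valid() {
  printf '%s\n' "$1" | grep -Eq '^projects/[0-9]+/locations/global/workloadIdentityPools/[^/]+/providers/[^/]+$'
}
```

Compare the output of wif_principal_set with the member shown by gcloud iam service-accounts get-iam-policy. A mismatch in the repository part is a common reason the binding does not apply.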
GitHub Authentication Fails
Symptoms:
- Cannot access deployment repository
- “Resource not accessible by integration” error
- PAT or GitHub App authentication fails
Common causes and solutions:
- GitHub App not installed
  Verify GitHub App installation:
  - Check app is installed on target repository
  - Review app permissions include repository access
  - Ensure app has Contents: Read & Write permission
- GitHub App credentials incorrect
  Check GitHub secrets:
  - GH_APP_ID: Numeric app ID (not app name)
  - GH_APP_PK: Complete private key including headers
    -----BEGIN RSA PRIVATE KEY-----
    ...
    -----END RSA PRIVATE KEY-----
- PAT lacks required permissions
  If using Personal Access Token:
  - Ensure PAT has 'repo' scope
  - Check PAT hasn't expired
  - Verify user has access to deployment repository
- Cross-organization access
  For accessing repos in different organizations:
  - GitHub App must be installed in target org
  - PAT user must be member of target org
  - Check organization SSO requirements
Debugging steps:
- Verify GitHub App installation and permissions
- Check secret values are complete and correct
- Test repository access manually
- Review workflow logs for specific error messages
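Truncated keys and non-numeric app IDs are easy to catch before saving the secrets. This is a minimal sketch with hypothetical function names. It checks only the shape of the values (digits only, BEGIN and END header lines present), not whether the key is cryptographically valid or belongs to the app.

```shell
#!/bin/sh
# Hypothetical checks for the two GitHub App secrets.
app_id_is_numeric() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Reads the key from stdin so it never appears on a command line.
private_key_is_complete() {
  key=$(cat)
  first=$(printf '%s\n' "$key" | head -n 1)
  last=$(printf '%s\n' "$key" | tail -n 1)
  case "$first" in
    "-----BEGIN "*"PRIVATE KEY-----") ;;
    *) return 1 ;;
  esac
  case "$last" in
    "-----END "*"PRIVATE KEY-----") ;;
    *) return 1 ;;
  esac
  return 0
}
```

For example, private_key_is_complete < app-key.pem checks the downloaded file, where app-key.pem is a placeholder for your own file name.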
Common Error Messages
“ImagePullBackOff” in Kubernetes
Cause: Kubernetes cannot pull the Docker image
Solutions:
- Verify image tag exists in registry
- Check image name and registry URL are correct
- Ensure Kubernetes has credentials to access private registry
- For ECR: Verify ECR image pull secret is configured
- Check network connectivity from cluster to registry
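To see which image the pod is trying to pull and why it fails, read the container status and split the reference into its parts. The sketch below uses hypothetical function names, assumes the default namespace unless another is given, and does not handle digest references (image@sha256:...).

```shell
#!/bin/sh
# Hypothetical helper: print each container image of a pod with the waiting
# reason and message reported for it, tab-separated.
pod_pull_status() {
  pod=$1
  ns=${2:-default}
  kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{range .status.containerStatuses[*]}{.image}{"\t"}{.state.waiting.reason}{"\t"}{.state.waiting.message}{"\n"}{end}'
}

# Split registry/repository:tag into its parts (tag defaults to latest).
split_image_ref() {
  ref=$1
  last=${ref##*/}               # final path segment, where a tag may appear
  case "$last" in
    *:*) tag=${last##*:}; repo=${ref%:*} ;;
    *)   tag=latest; repo=$ref ;;
  esac
  echo "registry=${repo%%/*} repository=${repo#*/} tag=$tag"
}
```

With the parts separated, you can look the tag up in the registry, for example with aws ecr describe-images --repository-name <repository> --image-ids imageTag=<tag> on ECR.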
“CrashLoopBackOff” in Kubernetes
Cause: Container starts but immediately crashes
Solutions:
- Check application logs:
  kubectl logs <pod-name>
- Verify environment variables are set correctly
- Ensure required secrets and config maps exist
- Check application dependencies (database, APIs) are accessible
- Review resource limits aren’t too restrictive
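When a container keeps restarting, the logs of the current instance are often empty. The reason and exit code of the previous instance are more telling. The sketch below uses hypothetical function names, looks only at the first container in the pod, and assumes the default namespace unless another is given.

```shell
#!/bin/sh
# Hypothetical helper: reason and exit code of the previous container instance.
last_exit() {
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason} {.status.containerStatuses[0].lastState.terminated.exitCode}'
}

# Turn the reason into a short hint about where to look next.
diagnose_crash() {
  pod=$1
  ns=${2:-default}
  set -- $(last_exit "$pod" "$ns")   # word-split into reason and exit code
  reason=$1
  code=$2
  case "$reason" in
    OOMKilled) echo "container exceeded its memory limit (exit $code)" ;;
    Completed) echo "container exited normally (exit $code); check that the command is long-running" ;;
    *) echo "container exited with ${reason:-unknown reason} (exit ${code:-?}); inspect: kubectl logs $pod -n $ns --previous" ;;
  esac
}
```

An OOMKilled result points at the resource limits item above. Anything else is usually explained by the --previous logs.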
“Workflow dispatch failed”
Cause: Cannot trigger GitHub Actions workflow
Solutions:
- Verify workflow file exists in repository
- Check workflow_dispatch trigger is configured
- Ensure you have permission to trigger workflows
- Review workflow inputs match expected parameters
- Check GitHub Actions is enabled for repository
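The first two checks can be done from a local checkout. The sketch below is a text-based heuristic with a hypothetical function name: it lists the workflow files that mention workflow_dispatch outside of comment lines, without parsing the YAML.

```shell
#!/bin/sh
# Hypothetical helper: list workflow files that mention a workflow_dispatch
# trigger outside of comment lines.
dispatchable_workflows() {
  dir=${1:-.github/workflows}
  for f in "$dir"/*.yml "$dir"/*.yaml; do
    [ -f "$f" ] || continue
    if grep -Ev '^[[:space:]]*#' "$f" | grep -q 'workflow_dispatch'; then
      basename "$f"
    fi
  done
}
```

If the expected file is listed, triggering it by hand with the GitHub CLI, for example gh workflow run <workflow-file> --ref <branch> -f ref=<branch>, shows the error GitHub returns for a rejected dispatch.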
Getting Help
If you’re still experiencing issues:
- Check GitHub Actions logs - Detailed error messages and stack traces
- Review Skyhook documentation - Additional guides and examples
- Verify configuration - Double-check .koala.toml and secrets
- Test components individually - Isolate the failing step
- Contact support - Provide workflow run URL and error details