📝 Author
Birat Aryal — birataryal.github.io
Created Date: 2026-03-11
Updated Date: Wednesday 11th March 2026 00:13:06
Website - birataryal.com.np
Repository - Birat Aryal
LinkedIn - Birat Aryal
DevSecOps Engineer | System Engineer | Cyber Security Analyst | Network Engineer
Cluster API initialization fails
Symptoms
clusterctl init fails or controllers do not start.
Possible causes
- Incorrect provider configuration
- Internet access issues
- Invalid clusterctl.yaml
Checks
clusterctl config repositories
kubectl get pods -A
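If the repository list looks wrong, re-running init with verbose logging usually pinpoints the failing provider. A sketch, assuming the vSphere infrastructure provider; adjust the provider and namespaces to your setup:

```shell
# Re-run initialization with verbose output to see which provider fails.
clusterctl init --infrastructure vsphere -v 5

# Inspect the provider controllers that should come up after init.
kubectl get pods -n capi-system
kubectl get pods -n capv-system
```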
VMs are not created in vCenter
Symptoms
Cluster YAML applied but no VMs appear.
Possible causes
- Incorrect template name
- Invalid resource pool
- Incorrect folder path
- Wrong datastore
Checks
kubectl get machines -A
kubectl get vspheremachines -A
kubectl -n capv-system logs deploy/capv-controller-manager
Verify vCenter objects:
govc ls "/${VSPHERE_DATACENTER}/vm"
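Beyond listing the VM folder, govc can confirm each object the cluster YAML references. The variable names below mirror the usual CAPV environment variables and are assumptions about this setup:

```shell
# Confirm the template, resource pool, datastore, and folder referenced
# in the cluster YAML actually exist in vCenter (paths are illustrative).
govc vm.info "${VSPHERE_TEMPLATE}"
govc pool.info "/${VSPHERE_DATACENTER}/host/*/Resources/${VSPHERE_RESOURCE_POOL}"
govc datastore.info "${VSPHERE_DATASTORE}"
govc folder.info "/${VSPHERE_DATACENTER}/vm/${VSPHERE_FOLDER}"
```

Any name that fails to resolve here is a strong candidate for why no VMs appear.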
VM created but node never joins cluster
Symptoms
VM boots but does not appear in Kubernetes nodes.
Possible causes
- cloud-init failure
- missing gateway
- DNS misconfiguration
- kubeadm failure
Check inside node
cloud-init status
journalctl -u kubelet -xe
ip route
cat /etc/resolv.conf
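When cloud-init reports an error, its logs usually show which stage failed, including the kubeadm join output. These are the standard cloud-init log locations:

```shell
# Show cloud-init's per-stage exit status.
sudo cloud-init status --long

# Full cloud-init output, including kubeadm join errors.
sudo less /var/log/cloud-init-output.log
sudo less /var/log/cloud-init.log

# Recent kubelet activity, in case the join succeeded but the kubelet is failing.
sudo journalctl -u kubelet --no-pager | tail -n 50
```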
Workers stuck in pending state
Symptoms
kubectl get machines shows workers stuck in the Provisioning phase.
Possible causes
- Control plane unreachable
- API VIP unreachable
- network misconfiguration
Checks
clusterctl describe cluster <cluster-name>
kubectl get kubeadmcontrolplanes
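The Machine object's status usually records why provisioning stalled. A drill-down sketch, with the cluster and machine names as placeholders:

```shell
# Show the full ownership tree with per-resource conditions.
clusterctl describe cluster <cluster-name> --show-conditions all

# Inspect a stuck worker's status for the failure reason and message.
kubectl describe machine <machine-name>
kubectl get machine <machine-name> -o jsonpath='{.status.conditions}'
```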
Cluster unreachable
Symptoms
CAPV logs show:
cluster is not reachable: connect: no route to host
Possible causes
- missing default route
- incorrect gateway
- kube-vip not running
Checks
ip route
ping <gateway>
curl -k https://<VIP>:6443/healthz
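kube-vip normally runs as a static pod on the control-plane nodes, so if the VIP is dead the manifest and container logs are the first things to check. A sketch, assuming the default static-pod path:

```shell
# On a control-plane node: confirm the static pod manifest exists.
ls /etc/kubernetes/manifests/kube-vip.yaml

# From a working kubeconfig: check the kube-vip pods and their logs.
kubectl -n kube-system get pods -o wide | grep kube-vip
kubectl -n kube-system logs <kube-vip-pod-name> --tail=50
```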
Calico does not start
Symptoms
Pods in calico-system not running.
Possible causes
- wrong pod CIDR
- invalid manifest patch
- CRS label mismatch
Checks
kubectl get clusterresourcesets
kubectl get configmap calico-manifest
kubectl get pods -n calico-system
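A pod-CIDR mismatch is the most common cause: the CIDR Calico uses must match the one the Cluster object declares. A comparison sketch, assuming the operator-based Calico install with the default Installation resource:

```shell
# CIDR the cluster was created with.
kubectl get cluster <cluster-name> \
  -o jsonpath='{.spec.clusterNetwork.pods.cidrBlocks}'

# CIDR Calico is actually using (path assumes the tigera-operator install).
kubectl get installation default \
  -o jsonpath='{.spec.calicoNetwork.ipPools[0].cidr}'
```

If the two values differ, fix the Calico manifest (or the CRS ConfigMap that carries it) rather than the cluster spec.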
vSphere CPI ImagePullBackOff
Symptoms
ImagePullBackOff
Cause
The CPI image version does not match the cluster's Kubernetes version.
Fix
Pin the cloud-provider-vsphere image tag to the cluster's Kubernetes minor version.
Example:
registry.k8s.io/cloud-pv-vsphere/cloud-provider-vsphere:v1.30.0
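To confirm a mismatch, compare each node's Kubernetes minor version against the image tag pinned in the CPI DaemonSet; the DaemonSet name below is the upstream default and may differ in your manifests:

```shell
# Kubernetes version of each node.
kubectl get nodes \
  -o custom-columns=NAME:.metadata.name,VERSION:.status.nodeInfo.kubeletVersion

# Image currently pinned in the CPI DaemonSet (name assumed).
kubectl -n kube-system get ds vsphere-cloud-controller-manager \
  -o jsonpath='{.spec.template.spec.containers[0].image}'
```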
Cluster deletion stuck
Symptoms
Cluster resources remain after deletion.
Possible causes
- finalizers
- orphaned VMs
Fix
Remove finalizers carefully:
kubectl patch vspheremachine <name> --type=merge -p '{"metadata":{"finalizers":[]}}'
Warning
Removing finalizers bypasses the controller's cleanup logic; use it only after normal deletion has failed and you have confirmed the backing VMs are already gone from vCenter.
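For orphaned VMs left behind in vCenter, govc can remove them once you have confirmed they belong to the deleted cluster; the folder path is an assumption and the destroy step is irreversible:

```shell
# List VMs still sitting in the cluster's folder.
govc ls "/${VSPHERE_DATACENTER}/vm/${VSPHERE_FOLDER}"

# Power off and delete a confirmed orphan (destructive; double-check the name).
govc vm.power -off -force <vm-name>
govc vm.destroy <vm-name>
```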