
· 5 min read
Webber Huang

In this Harvester Knowledge Base article, Ivan Sim provided comprehensive guidance on using Velero to perform backup and restore operations for VMs with external storage in Harvester.

However, in certain scenarios, users may require the VM filesystem to be quiesced during Velero backup creation to prevent data corruption, especially when the VM is experiencing heavy I/O operations.

This article describes how to customize Velero Backup Hooks to implement filesystem freeze during Velero backup processing, ensuring data consistency in the backup content.

Background Knowledge

KubeVirt's virt-freezer provides a mechanism to freeze and thaw guest filesystems. This capability can be leveraged to ensure filesystem consistency during VM backups. However, certain prerequisites must be met for filesystem freeze/thaw operations to function properly:

Prerequisites for Filesystem Freeze

  • QEMU Guest Agent must be enabled in the guest VM
    • Verify this by checking if the VMI has AgentConnected in its status
  • Guest VM must be properly configured for related libvirt commands
    • When virt-freezer is triggered, KubeVirt communicates with the QEMU Guest Agent via libvirt commands such as guest-fsfreeze-freeze
    • The guest agent translates these commands to OS-specific calls:
      • Linux systems: Uses fsfreeze syscalls
      • Windows systems: Uses VSS (Volume Shadow Copy Service) APIs
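Before wiring up backup hooks, you can script the agent check. The helper below is a minimal sketch, assuming `kubectl` access to the cluster; the namespace and VM name are placeholders you supply:

```shell
# Succeeds only when the VMI reports the AgentConnected condition as True.
check_agent_connected() {
  ns="$1"; name="$2"
  status=$(kubectl get vmi "$name" -n "$ns" \
    -o jsonpath='{.status.conditions[?(@.type=="AgentConnected")].status}')
  [ "$status" = "True" ]
}

# Usage: check_agent_connected <VM Namespace> <VM Name>
```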

Common Configuration Challenges

Based on Harvester project experience, some guest operating systems require additional configuration:

  • Linux distributions (e.g., RHEL, SLE Micro): May lack sufficient permissions for filesystem freeze operations by default, requiring custom policies
  • Windows guests: Require the VSS service to be enabled for filesystem freeze functionality

Important: Filesystem freeze/thaw functionality depends on guest VM configuration, which is outside Harvester's control. Users are responsible for ensuring compatibility before implementing Velero backup hooks with filesystem freeze.

Verifying Filesystem Freeze Compatibility

To confirm that your VM supports filesystem freeze operations:

  1. Access the virtual machine's virt-launcher compute container:

    POD=$(kubectl get pods -n <VM Namespace> \
      -l vm.kubevirt.io/name=<VM Name> \
      -o jsonpath='{.items[0].metadata.name}')
    kubectl exec -it $POD -n <VM Namespace> -c compute -- bash
  2. Test filesystem freeze using the virt-freezer application available in the compute container:

    virt-freezer --freeze --namespace <VM namespace> --name <VM name>
  3. Critical: Always verify the freeze operation result and thaw the VM filesystems before performing any other operations
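Step 3 is easy to forget when testing by hand, so you may want to wrap the freeze in a helper that always thaws. A minimal sketch, run inside the compute container, using only virt-freezer's --freeze/--unfreeze flags:

```shell
# Freeze the guest filesystems, then thaw unconditionally so the guest
# never stays frozen if the freeze (or anything after it) fails.
freeze_then_thaw() {
  ns="$1"; name="$2"
  virt-freezer --freeze --namespace "$ns" --name "$name"
  rc=$?
  [ $rc -ne 0 ] && echo "freeze failed (rc=$rc)" >&2
  # Critical: thaw regardless of the freeze result.
  virt-freezer --unfreeze --namespace "$ns" --name "$name" || rc=$?
  return $rc
}

# Usage: freeze_then_thaw <VM namespace> <VM name>
```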

Prerequisites

All preparation steps outlined in External CSI Storage Backup and Restore With Velero are mandatory, including:

  • Harvester installation and configuration
  • Velero installation and setup
  • S3-compatible storage configuration
  • Proper networking and permissions

Implementing Filesystem Freeze Hooks for VM Backup Consistency

Velero supports pre and post backup hooks that can be integrated with KubeVirt's virt-freezer to ensure filesystem consistency during VM backups.

Configuring VM Template Annotations

For all VMs requiring data consistency, add the following annotations to the VM template:

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: vm-nfs
  namespace: demo
spec:
  template:
    metadata:
      annotations:
        # These annotations will be applied to the virt-launcher pod
        pre.hook.backup.velero.io/command: '["/usr/bin/virt-freezer", "--freeze", "--namespace", "<VM Namespace>", "--name", "<VM Name>"]'
        pre.hook.backup.velero.io/container: compute
        pre.hook.backup.velero.io/on-error: Fail
        pre.hook.backup.velero.io/timeout: 30s

        post.hook.backup.velero.io/command: '["/usr/bin/virt-freezer", "--unfreeze", "--namespace", "<VM Namespace>", "--name", "<VM Name>"]'
        post.hook.backup.velero.io/container: compute
        post.hook.backup.velero.io/timeout: 30s
    spec:
      # ...rest of VM spec...

These annotations will be propagated to the related virt-launcher pod and instruct Velero to:

  • Freeze the VM filesystem before backup creation begins
  • Thaw the VM filesystem after backup completion

Important: Replace <VM Namespace> and <VM Name> with the actual namespace and name of your VM.

Creating a Velero Backup with Filesystem Freeze

After applying the Velero pre/post hook annotations to the VM manifest, follow the backup procedures described in External CSI Storage Backup and Restore With Velero.

Verifying Successful Hook Execution

If the guest VM is configured correctly, the Velero backup will complete successfully with HooksAttempted indicating successful hook execution.

Check the backup status using:

velero backup describe [Backup Name] --details

Example output showing successful hook execution:

Name:         demo
Namespace:    velero
Labels:       velero.io/storage-location=default
Annotations:  velero.io/resource-timeout=10m0s
              velero.io/source-cluster-k8s-gitversion=v1.33.3+rke2r1
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=33

Phase:  Completed


Namespaces:
  Included:  demo
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

Label selector:  <none>

Or label selector:  <none>

Storage Location:  default

Velero-Native Snapshot PVs:  auto
Snapshot Move Data:          true
Data Mover:                  velero

....

Backup Volumes:
  Velero-Native Snapshots: <none included>

  CSI Snapshots:
    demo/vm-nfs-disk-0-au2ej:
      Data Movement:
        Operation ID: du-be5417aa-498e-4b93-b59f-e6498f95a6df.d7f97dab-3bb1-41e189381
        Data Mover: velero
        Uploader Type: kopia
        Moved data Size (bytes): 5368709120
        Result: succeeded

  Pod Volume Backups: <none included>

HooksAttempted:  2
HooksFailed:     0

The output shows that Velero pre/post backup hooks completed successfully. In this case, the hooks are connected to guest VM filesystem freeze and thaw operations to ensure data consistency.
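If you automate this check, the hook counters can be extracted from the describe output with standard text tools. A minimal sketch, assuming the `HooksFailed: <n>` line format shown above:

```shell
# Reads `velero backup describe` output on stdin; fails when any
# backup hook failed.
hooks_ok() {
  awk '/^HooksFailed:/ { if ($2 != 0) exit 1 }'
}

# Usage: velero backup describe <Backup Name> --details | hooks_ok
```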

Restoring the Velero Backup

Follow the restoration procedures described in External CSI Storage Backup and Restore With Velero to restore the namespace using Velero.

Troubleshooting

If you encounter issues with filesystem freeze operations:

  1. Verify QEMU Guest Agent status in the VMI
  2. Check guest OS configuration for filesystem freeze support
  3. Review Velero hook logs for specific error messages
  4. Test virt-freezer manually as described in the verification section

Conclusion

Implementing filesystem freeze hooks with Velero ensures data consistency during VM backups by quiescing the filesystem before snapshot creation. This approach is particularly valuable for VMs with high I/O activity or critical data that requires point-in-time consistency guarantees.

· 3 min read
Jack Yu

Problem Description

For Harvester to successfully migrate a virtual machine from one node to another, the source and target nodes must have compatible CPU models and features.

If the CPU model of a virtual machine isn't specified, KubeVirt assigns it the default host-model configuration so that the virtual machine has the CPU model closest to the one used on the host node.

KubeVirt automatically adjusts the node selectors of the associated virt-launcher Pod based on this configuration. If the CPU models and features of the source and target nodes do not match, the live migration may fail.

Let's examine an example.

When a virtual machine is first migrated to another node with the SierraForest CPU model, the following key-value pairs are added to the spec.nodeSelector field in the Pod spec.

spec:
  nodeSelector:
    cpu-model-migration.node.kubevirt.io/SierraForest: "true"
    cpu-feature.node.kubevirt.io/fpu: "true"
    cpu-feature.node.kubevirt.io/vme: "true"

The above nodeSelector configuration is retained for subsequent migrations, which may fail if the new target node doesn't have the corresponding features or model.

For example, compare the CPU model and feature labels added by KubeVirt to the following two nodes:

# Node A
labels:
  cpu-model-migration.node.kubevirt.io/SierraForest: "true"
  cpu-feature.node.kubevirt.io/fpu: "true"
  cpu-feature.node.kubevirt.io/vme: "true"

# Node B
labels:
  cpu-model-migration.node.kubevirt.io/SierraForest: "true"
  cpu-feature.node.kubevirt.io/vme: "true"

This virtual machine will fail to migrate to Node B due to the missing fpu feature. However, if the virtual machine doesn't actually require this feature, this can be frustrating. Therefore, setting up a common CPU model can resolve this issue.
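The mismatch above can be spotted mechanically by diffing the two nodes' sorted label keys. A minimal sketch with `comm`, using the example labels from the listing above (the `node-a.labels`/`node-b.labels` file names are just for illustration):

```shell
# Write each node's migration-relevant label keys, one per line.
printf '%s\n' \
  cpu-model-migration.node.kubevirt.io/SierraForest \
  cpu-feature.node.kubevirt.io/fpu \
  cpu-feature.node.kubevirt.io/vme | sort > node-a.labels
printf '%s\n' \
  cpu-model-migration.node.kubevirt.io/SierraForest \
  cpu-feature.node.kubevirt.io/vme | sort > node-b.labels

# Labels present on Node A but missing on Node B; any output here
# means the migration node selector can fail.
comm -23 node-a.labels node-b.labels
# → cpu-feature.node.kubevirt.io/fpu
```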

How to Set Up a Common CPU Model

You can define a custom CPU model to ensure that the spec.nodeSelector configuration in the Pod spec is assigned a CPU model that is compatible and common to all nodes in the cluster.

Consider this example.

We have the following node information:

# Node A
labels:
  cpu-model.node.kubevirt.io/IvyBridge: "true"
  cpu-feature.node.kubevirt.io/fpu: "true"
  cpu-feature.node.kubevirt.io/vme: "true"

# Node B
labels:
  cpu-model.node.kubevirt.io/IvyBridge: "true"
  cpu-feature.node.kubevirt.io/vme: "true"

If we set up IvyBridge as our CPU model in the virtual machine spec, KubeVirt only adds cpu-model.node.kubevirt.io/IvyBridge under spec.nodeSelector in the Pod spec.

# Virtual Machine Spec
spec:
  template:
    spec:
      domain:
        cpu:
          model: IvyBridge

# Pod spec
spec:
  nodeSelector:
    cpu-model.node.kubevirt.io/IvyBridge: "true"

With this configuration, your virtual machine can be migrated to any node that has the label cpu-model.node.kubevirt.io/IvyBridge.
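To see exactly which nodes qualify, you can query by that label. A hedged sketch, assuming `kubectl` access to the cluster; `IvyBridge` follows the example above:

```shell
# Print the name of every node advertising the given CPU model label.
list_eligible_nodes() {
  kubectl get nodes \
    -l "cpu-model.node.kubevirt.io/$1=true" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}'
}

# Usage: list_eligible_nodes IvyBridge
```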

Set Up Cluster-Wide Configuration

If your virtual machines run only on a specific CPU model, you can set up a cluster-wide CPU model in the kubevirt resource.

You can edit it with kubectl edit kubevirt kubevirt -n harvester-system, then add the CPU model you want in the following spec:

spec:
  configuration:
    cpuModel: IvyBridge

Then, when a new virtual machine starts or an existing virtual machine restarts, the cluster-wide setting will be applied. The system follows these priorities when using CPU models if you configure them in both locations:

  1. CPU model in the virtual machine spec.
  2. CPU model in the KubeVirt spec.

References

· 2 min read
Masashi Homma

We have the default SSH user rancher, but other users may be required. Creating users with useradd will result in their deletion upon restarting the Harvester node; therefore, follow the steps below to create persistent SSH users.

Public Key Authentication

Create cloud-init.yaml

Create cloud-init.yaml with the following content.

  • Modify the matchSelector if you want to create an SSH user only on specific nodes.
  • #cloud-config should be written exactly as shown.
  • User must be in the admin group.
  • Specify the public key in ssh_authorized_keys.
  • Modify the contents according to your environment.
apiVersion: node.harvesterhci.io/v1beta1
kind: CloudInit
metadata:
  name: add-test-user
spec:
  matchSelector: {} # applies to all nodes
  filename: 99_add_test_user.yaml
  contents: |
    #cloud-config
    users:
      - name: test-user
        gecos: "admin_user"
        groups: [users, admin]
        sudo: ALL=(ALL) NOPASSWD:ALL
        shell: /bin/bash
        ssh_authorized_keys:
          - ssh-rsa AAAA.... # <--insert full authorized key here, e.g. from your ~/.ssh/id_rsa.pub file

Password Authentication

Create password hash

Use the following command to create a password hash. Replace test with your actual password.

$ openssl passwd -6 'test'
$6$zF26pcXOS2eaivX8$6ySoTzQC2cToz29mGFC0DuG5cVWTv3Mktc3k/g1KXTtrG2BhsFh8xs3N0zBmNx0D/H4f1W48a45vI1RK8Rzs.0
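You can sanity-check a hash before putting it in the manifest: the third `$`-separated field of a SHA-512 crypt string is the salt, and re-hashing the same password with that salt must reproduce the string exactly. A minimal sketch, assuming `openssl` is available:

```shell
# Generate a hash for the password 'test', extract the embedded salt,
# and confirm that re-hashing with that salt reproduces the same hash.
hash=$(openssl passwd -6 'test')
salt=$(printf '%s' "$hash" | cut -d'$' -f3)
regen=$(openssl passwd -6 -salt "$salt" 'test')
[ "$hash" = "$regen" ] && echo "hash verified"
```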

Create cloud-init.yaml

Create cloud-init.yaml with the following content.

  • Modify the matchSelector if you want to create an SSH user only on specific nodes.
  • #cloud-config should be written exactly as shown.
  • User must be in the admin group.
  • Specify the hash created earlier in passwd.
  • Modify the contents according to your environment.
apiVersion: node.harvesterhci.io/v1beta1
kind: CloudInit
metadata:
  name: add-test-user
spec:
  matchSelector: {} # applies to all nodes
  filename: 99_add_test_user.yaml
  contents: |
    #cloud-config
    users:
      - name: test-user
        gecos: "admin_user"
        groups: [users, admin]
        sudo: ALL=(ALL) NOPASSWD:ALL
        shell: /bin/bash
        lock_passwd: false
        passwd: $6$zF26pcXOS2eaivX8$6ySoTzQC2cToz29mGFC0DuG5cVWTv3Mktc3k/g1KXTtrG2BhsFh8xs3N0zBmNx0D/H4f1W48a45vI1RK8Rzs.0

Apply the YAML file

Apply the YAML with this command.

kubectl apply -f cloud-init.yaml

The following file will be created on each matching node.

$ cat /oem/99_add_test_user.yaml
#cloud-config
users:
  - name: test-user
    gecos: "admin_user"
    groups: users, admin
    sudo: ALL=(ALL) NOPASSWD:ALL
    shell: /bin/bash
...

Reboot Harvester nodes

  1. Run Enable Maintenance Mode on the Harvester UI.
  2. Wait until the state changes to Maintenance.
  3. Reboot the Harvester node.
  4. Run Disable Maintenance Mode on the Harvester UI.

After that, SSH login will be possible.

Tested on: Harvester 1.6.0, 1.5.1.

· One min read
Masashi Homma

If a Harvester node becomes unreachable, Harvester attempts to reschedule its virtual machines to another healthy node. However, this rescheduling doesn't happen immediately. The associated virt-launcher pods may continue to appear ready because of their KubeVirt readiness gate configuration.

To reduce this delay, you can lower the period value of the vm-force-reset-policy setting. This enables Harvester to detect non-ready virtual machines on unreachable nodes sooner.

This setting can be found in the Advanced -> Settings page on the Harvester UI.


Additionally, while the current default is 5 minutes, we are considering reducing the default value [1].

References

[1] https://github.com/harvester/harvester/issues/8971

· 3 min read
Tim Serong

Harvester allows you to add disks as data volumes. However, only disks that have a World Wide Name (WWN) are displayed on the UI. This occurs because the Harvester node-disk-manager uses the ID_WWN value from udev to uniquely identify disks. The value may not exist in certain situations, particularly when the disks are connected to certain hardware RAID controllers. In these situations, you can view the disks only if you access the host using SSH and run a command such as cat /proc/partitions.

To allow extra disks without WWNs to be visible to Harvester, perform either of the following workarounds:

Workaround 1: Create a filesystem on the disk

caution

Use this method only if the provisioner of the extra disk is Longhorn V1, which is filesystem-based. This method will not work correctly with LVM and Longhorn V2, which are both block device-based.

When you create a filesystem on a disk (for example, using the command mkfs.ext4 /dev/sda), a filesystem UUID is assigned to the disk. Harvester uses this value to identify disks without a WWN.

In Harvester versions earlier than v1.6.0, you can use this workaround for only one extra disk because of a bug in duplicate device checking.

Workaround 2: Add a udev rule for generating fake WWNs

note

This method works with all of the supported provisioners.

You can add a udev rule that generates a fake WWN for each extra disk based on the device serial number. Harvester accepts the generated WWNs because the only requirement is a unique ID_WWN value as presented by udev.
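The naming scheme itself is trivial: the fake WWN is just `fake.` prepended to the device's `ID_SERIAL`. A minimal sketch of the transformation (the serial number shown is hypothetical):

```shell
# Mimic the udev assignment ENV{ID_WWN}="fake.$env{ID_SERIAL}".
make_fake_wwn() {
  printf 'fake.%s\n' "$1"
}

make_fake_wwn '5000c500a1b2c3d4_serial'   # hypothetical serial number
# → fake.5000c500a1b2c3d4_serial
```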

A YAML file containing the necessary udev rule must be created in the /oem directory on each host. This process can be automated across the Harvester cluster using a CloudInit Resource.

  1. Create a YAML file named fake-scsi-wwn-generator.yaml with the following contents:

    apiVersion: node.harvesterhci.io/v1beta1
    kind: CloudInit
    metadata:
      name: fake-scsi-wwn-generator
    spec:
      matchSelector: {}
      filename: 90_fake_scsi_wwn_generator.yaml
      contents: |
        name: "Add udev rules to generate missing SCSI disk WWNs"
        stages:
          initramfs:
            - files:
                - path: /etc/udev/rules.d/59-fake-scsi-wwn-generator.rules
                  permissions: 420
                  owner: 0
                  group: 0
                  content: |
                    # For anything that looks like a SCSI disk (/dev/sd*),
                    # if it has a serial number, but does _not_ have a WWN,
                    # create a fake WWN based on the serial number. We need
                    # to set both ID_WWN so Harvester's node-disk-manager
                    # can see the WWN, and ID_WWN_WITH_EXTENSION which is
                    # what 60-persistent-storage.rules uses to generate a
                    # /dev/disk/by-id/wwn-* symlink for the device.
                    ACTION=="add|change", SUBSYSTEM=="block", KERNEL=="sd*[!0-9]", \
                    ENV{ID_SERIAL}=="?*", \
                    ENV{ID_WWN}!="?*", ENV{ID_WWN_WITH_EXTENSION}!="?*", \
                    ENV{ID_WWN}="fake.$env{ID_SERIAL}", \
                    ENV{ID_WWN_WITH_EXTENSION}="fake.$env{ID_SERIAL}"
  2. Apply the file's contents to the cluster by running the command kubectl apply -f fake-scsi-wwn-generator.yaml.

    The file /oem/90_fake_scsi_wwn_generator.yaml is automatically created on all cluster nodes.

  3. Reboot all nodes to apply the new udev rule.

Once the rule is applied, you should be able to view and add extra disks that were previously not visible on the Harvester UI.

References

· 8 min read
Ivan Sim

Harvester 1.5 introduces support for the provisioning of virtual machine root volumes and data volumes using external Container Storage Interface (CSI) drivers.

This article demonstrates how to use Velero 1.16.0 to perform backup and restore of virtual machines in Harvester.

It goes through commands and manifests to:

  • Back up virtual machines in a namespace, their NFS CSI volumes, and associated namespace-scoped configuration
  • Export the backup artifacts to an AWS S3 bucket
  • Restore to a different namespace on the same cluster
  • Restore to a different cluster

Velero is a Kubernetes-native backup and restore tool that enables users to perform scheduled and on-demand backups of virtual machines to external object storage providers such as S3, Azure Blob, or GCS, aligning with enterprise backup and disaster recovery practices.

note

The commands and manifests used in this article are tested with Harvester 1.5.1.

The CSI NFS driver and Velero configuration and versions used are for demonstration purposes only. Adjust them according to your environment and requirements.

important

The examples provided are intended for backing up and restoring Linux virtual machine workloads. They are not suitable for backing up guest clusters provisioned via the Harvester Rancher integration.

To back up and restore guest clusters like RKE2, refer to the distribution's official documentation.

Harvester Installation

Refer to the Harvester documentation for installation requirements and options.

The kubeconfig file of the Harvester cluster can be retrieved following the instructions here.

Install and Configure Velero

Download the Velero CLI.

Set the following shell variables:

BUCKET_NAME=<your-s3-bucket-name>
BUCKET_REGION=<your-s3-bucket-region>
AWS_CREDENTIALS_FILE=<absolute-path-to-your-aws-credentials-file>

Install Velero on the Harvester cluster:

velero install \
--provider aws \
--features=EnableCSI \
--plugins "velero/velero-plugin-for-aws:v1.12.0,quay.io/kubevirt/kubevirt-velero-plugin:v0.7.1" \
--bucket "${BUCKET_NAME}" \
--secret-file "${AWS_CREDENTIALS_FILE}" \
--backup-location-config region="${BUCKET_REGION}" \
--snapshot-location-config region="${BUCKET_REGION}" \
--use-node-agent
  • In this setup, Velero is configured to:

    • Run in the velero namespace
    • Enable CSI volume snapshot APIs
    • Enable the built-in node agent data movement controllers and pods
    • Use the velero-plugin-for-aws plugin to manage interactions with the S3 object store
    • Use the kubevirt-velero-plugin plugin to backup and restore KubeVirt resources

Confirm that Velero is installed and running:

kubectl -n velero get po
NAME                      READY   STATUS    RESTARTS   AGE
node-agent-875mr          1/1     Running   0          1d
velero-745645565f-5dqgr   1/1     Running   0          1d

Configure the velero CLI to output the backup and restore status of CSI objects:

velero client config set features=EnableCSI

Deploy the NFS CSI and Example Server

Follow the instructions in the NFS CSI documentation to set up the NFS CSI driver, its storage class, and an example NFS server.

The NFS CSI volume snapshotting capability must also be enabled following the instructions here.

Confirm that the NFS CSI and example server are running:

kubectl get po -A -l 'app in (csi-nfs-node,csi-nfs-controller,nfs-server)'
NAMESPACE     NAME                                  READY   STATUS    RESTARTS   AGE
default       nfs-server-b767db8c8-9ltt4            1/1     Running   0          1d
kube-system   csi-nfs-controller-5bf646f7cc-6vfxn   5/5     Running   0          1d
kube-system   csi-nfs-node-9z6pt                    3/3     Running   0          1d

The default NFS CSI storage class is named nfs-csi:

kubectl get sc nfs-csi
NAME      PROVISIONER      RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
nfs-csi   nfs.csi.k8s.io   Delete          Immediate           true                   14d

Confirm that the default NFS CSI volume snapshot class csi-nfs-snapclass is also installed:

kubectl get volumesnapshotclass csi-nfs-snapclass
NAME                DRIVER           DELETIONPOLICY   AGE
csi-nfs-snapclass   nfs.csi.k8s.io   Delete           14d

Preparing the Virtual Machine and Image

Create a custom namespace named demo-src:

kubectl create ns demo-src

Follow the instructions in the Image Management documentation to upload the Ubuntu 24.04 raw image from https://cloud-images.ubuntu.com/minimal/releases/noble/ to Harvester.

The storage class of the image must be set to nfs-csi, per the Third-Party Storage Support documentation.

Confirm that the virtual machine image is successfully uploaded to Harvester.

Follow the instructions in the third-party storage documentation to create a virtual machine with NFS root and data volumes, using the image uploaded in the previous step.

For NFS CSI snapshot to work, the NFS data volume must have the volumeMode set to Filesystem.

optional

For testing purposes, once the virtual machine is ready, access it via SSH and add some files to both the root and data volumes.

The data volume needs to be partitioned, with a file system created and mounted before files can be written to it.

Backup the Source Namespace

Use the velero CLI to create a backup of the demo-src namespace using Velero's built-in data mover:

BACKUP_NAME=backup-demo-src-`date "+%s"`

velero backup create "${BACKUP_NAME}" \
--include-namespaces demo-src \
--snapshot-move-data
info

For more information on Velero's data mover, see its documentation on CSI data snapshot movement capability.

This creates a backup of the demo-src namespace containing resources like the virtual machine created earlier, its volumes, secrets and other associated configuration.

Depending on the size of the virtual machine and its volumes, the backup may take a while to complete.

The DataUpload custom resources provide insights into the backup progress:

kubectl -n velero get datauploads -l velero.io/backup-name="${BACKUP_NAME}"

Confirm that the backup completed successfully:

velero backup get "${BACKUP_NAME}"
NAME                         STATUS      ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
backup-demo-src-1747954979   Completed   0        0          2025-05-22 16:04:46 -0700 PDT   29d       default            <none>

After the backup completes, Velero removes the CSI snapshots from the storage side to free up the snapshot data space.

tips

The velero backup describe and velero backup logs commands can be used to assess details of the backup including resources included, skipped, and any warnings or errors encountered during the backup process.

Restore To A Different Namespace

This section describes how to restore the backup from the demo-src namespace to a new namespace named demo-dst.

Save the following restore modifier to a local file named modifier-data-volumes.yaml:

cat <<EOF > modifier-data-volumes.yaml
version: v1
resourceModifierRules:
  - conditions:
      groupResource: persistentvolumeclaims
      matches:
        - path: /metadata/annotations/harvesterhci.io~1volumeForVirtualMachine
          value: "\"true\""
    patches:
      - operation: remove
        path: /metadata/annotations/harvesterhci.io~1volumeForVirtualMachine
EOF

This restore modifier removes the harvesterhci.io/volumeForVirtualMachine annotation from the virtual machine data volumes to ensure that the restoration does not conflict with the CDI volume import populator.
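The `~1` sequences in the modifier paths are JSON Pointer escaping (RFC 6901): within a pointer segment, `~` is written as `~0` and `/` as `~1`, which is why the `/` in the annotation key appears as `~1`. A small helper for building such paths:

```shell
# Escape an annotation/label key for use in a JSON Pointer segment.
# '~' must be replaced first, then '/'.
json_pointer_escape() {
  printf '%s\n' "$1" | sed -e 's/~/~0/g' -e 's#/#~1#g'
}

json_pointer_escape 'harvesterhci.io/volumeForVirtualMachine'
# → harvesterhci.io~1volumeForVirtualMachine
```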

Create the restore modifier:

kubectl -n velero create cm modifier-data-volumes --from-file=modifier-data-volumes.yaml

Assign the backup name to a shell variable:

BACKUP_NAME=backup-demo-src-1747954979

Start the restore operation:

velero restore create \
--from-backup "${BACKUP_NAME}" \
--namespace-mappings "demo-src:demo-dst" \
--exclude-resources "virtualmachineimages.harvesterhci.io" \
--resource-modifier-configmap "modifier-data-volumes" \
--labels "velero.kubevirt.io/clear-mac-address=true,velero.kubevirt.io/generate-new-firmware-uuid=true"
  • During the restore:

    • The virtual machine MAC address and firmware UUID are reset to avoid potential conflicts with existing virtual machines.
    • The virtual machine image manifest is excluded because Velero restores the entire state of the virtual machine from the backup.
    • The modifier-data-volumes restore modifier is invoked to modify the virtual machine data volume metadata to prevent conflicts with the CDI volume import populator.

While the restore operation is still in-progress, the DataDownload custom resources can be used to examine the progress of the operation:

RESTORE_NAME=backup-demo-src-1747954979-20250522164015

kubectl -n velero get datadownload -l velero.io/restore-name="${RESTORE_NAME}"

Confirm that the restore completed successfully:

velero restore get
NAME                                        BACKUP                       STATUS      STARTED                         COMPLETED                       ERRORS   WARNINGS   CREATED                         SELECTOR
backup-demo-src-1747954979-20250522164015   backup-demo-src-1747954979   Completed   2025-05-22 16:40:15 -0700 PDT   2025-05-22 16:40:49 -0700 PDT   0        6          2025-05-22 16:40:15 -0700 PDT   <none>

Verify that the virtual machine and its configuration are restored to the new demo-dst namespace.

note

Velero uses Kopia as its default data mover. This issue describes some of its limitations on advanced file system features such as setuid/gid, hard links, mount points, sockets, xattr, ACLs, etc.

Velero provides the --data-mover option to configure custom data movers to satisfy different use cases. For more information, see the Velero's documentation.

tips

The velero restore describe and velero restore logs commands provide more insights into the restore operation including the resources restored, skipped, and any warnings or errors encountered during the restore process.

Restore To A Different Cluster

This section extends the above scenario to demonstrate the steps to restore the backup to a different Harvester cluster.

On the target cluster, install Velero, and set up the NFS CSI and NFS server following the instructions from the Deploy the NFS CSI and Example Server section.

Once Velero is configured to use the same backup location as the source cluster, it automatically discovers the available backups:

velero backup get
NAME                         STATUS      ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
backup-demo-src-1747954979   Completed   0        0          2025-05-22 16:04:46 -0700 PDT   29d       default            <none>

Follow the steps in the Restore To A Different Namespace section to restore the backup on the target cluster.

Remove the --namespace-mappings option to set the restored namespace to demo-src on the target cluster.

Confirm that the virtual machine and its configuration are restored to the demo-src namespace.

Select Longhorn Volume Snapshot Class

To perform Velero backup and restore of virtual machines with Longhorn volumes, label the Longhorn volume snapshot class longhorn as follows:

kubectl label volumesnapshotclass longhorn velero.io/csi-volumesnapshot-class=true

This helps Velero to find the correct Longhorn snapshot class to use during backup and restore.
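After labeling, you can confirm which snapshot classes carry the selection label. A hedged sketch, assuming `kubectl` access; Velero selects VolumeSnapshotClasses labeled `velero.io/csi-volumesnapshot-class=true`:

```shell
# Print the VolumeSnapshotClasses carrying the Velero selection label.
selected_snapclasses() {
  kubectl get volumesnapshotclass \
    -l velero.io/csi-volumesnapshot-class=true \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}'
}

# Usage: selected_snapclasses
```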

Limitations

Enhancements related to the limitations described in this section are tracked at https://github.com/harvester/harvester/issues/8367.

  • By default, Velero only supports resource filtering by resource groups and labels. To back up or restore a single virtual machine instance, custom labels must be applied to the virtual machine and its virtual machine instance, pod, data volume, persistent volume claim, persistent volume, and cloud-init secret resources. It's recommended to back up the entire namespace and perform resource filtering during restore to ensure that the backup contains all the dependency resources required by the virtual machine.

  • The restoration of virtual machine images is not yet fully supported.

· 4 min read

Users wishing to prevent privilege escalation and other security issues can leverage Kubernetes' Pod Security Standards (PSS) on Harvester. PSS are a set of security policies that can be applied to clusters and namespaces to control and restrict how workloads are executed.

Pod Security Standards in Harvester can be used when provisioning VM workloads and also with the new experimental support for running baremetal container workloads.

The baseline policy is aimed at ease of adoption for common containerized workloads while preventing known privilege escalations. This policy is targeted at application operators and developers of non-critical applications.

warning

VMs with device passthrough, such as pcidevices, usbdevices and vgpudevices, will fail to start with baseline policy, as they need SYS_RESOURCE capability. This is being tracked on issue #8218. A fix should be available for this shortly.

Namespace level enablement

To enable PSS a user simply needs to label their workload namespaces as follows:

kubectl label --overwrite ns <namespace>  pod-security.kubernetes.io/enforce=baseline
note

Do not apply PSS to the system's namespaces, as they need privileged permissions to manage cluster resources. Only trusted users must have access to system's namespaces.

Cluster scoped enablement

Cluster wide PSS can be enabled by passing an Admission Control configuration via kube-apiserver arguments. This can be done via Harvester's CloudInit using the following configuration which can be saved to cloudinit-pss.yaml file:

apiVersion: node.harvesterhci.io/v1beta1
kind: CloudInit
metadata:
  name: cluster-wide-pss-enforcement
spec:
  matchSelector:
    node-role.kubernetes.io/control-plane: "true"
  filename: 99-pss.yaml
  contents: |
    stages:
      initramfs:
        - name: "setup harvester pss"
          directories:
            - path: /etc/rancher/rke2/config
              owner: 0
              group: 0
              permissions: 384
          files:
            - content: |
                kube-apiserver-arg:
                  - "admission-control-config-file=/etc/rancher/rke2/config/harvester-pss.yaml"
              path: /etc/rancher/rke2/config.yaml.d/99-harvester-pss.yaml
              permissions: 384
              owner: 0
              group: 0
            - content: |
                apiVersion: apiserver.config.k8s.io/v1
                kind: AdmissionConfiguration
                plugins:
                  - name: PodSecurity
                    configuration:
                      apiVersion: pod-security.admission.config.k8s.io/v1
                      kind: PodSecurityConfiguration
                      defaults:
                        enforce: "baseline"
                        enforce-version: "latest"
                        audit: "baseline"
                        audit-version: "latest"
                        warn: "baseline"
                        warn-version: "latest"
                      exemptions:
                        usernames: []
                        runtimeClasses: []
                        namespaces: [calico-apiserver,
                                     calico-system,
                                     cattle-alerting,
                                     cattle-csp-adapter-system,
                                     cattle-elemental-system,
                                     cattle-epinio-system,
                                     cattle-externalip-system,
                                     cattle-fleet-local-system,
                                     cattle-fleet-system,
                                     cattle-gatekeeper-system,
                                     cattle-global-data,
                                     cattle-global-nt,
                                     cattle-impersonation-system,
                                     cattle-istio,
                                     cattle-istio-system,
                                     cattle-logging,
                                     cattle-logging-system,
                                     cattle-monitoring-system,
                                     cattle-neuvector-system,
                                     cattle-prometheus,
                                     cattle-provisioning-capi-system,
                                     cattle-resources-system,
                                     cattle-sriov-system,
                                     cattle-system,
                                     cattle-ui-plugin-system,
                                     cattle-windows-gmsa-system,
                                     cert-manager,
                                     cis-operator-system,
                                     fleet-default,
                                     ingress-nginx,
                                     istio-system,
                                     kube-node-lease,
                                     kube-public,
                                     kube-system,
                                     longhorn-system,
                                     rancher-alerting-drivers,
                                     security-scan,
                                     tigera-operator,
                                     harvester-system,
                                     harvester-public,
                                     rancher-vcluster]
              path: /etc/rancher/rke2/config/harvester-pss.yaml
              permissions: 384
              owner: 0
              group: 0
  paused: false

The cluster admin can apply this to the Harvester cluster using kubectl apply -f cloudinit-pss.yaml. The change requires a restart of the control plane nodes so that the Elemental cloud-init directives are applied on boot. Once the control plane nodes are rebooted, a default baseline Pod Security Standard is enforced against all current and subsequently created namespaces. The namespaces listed under exemptions are skipped. Users are free to adjust the list to better suit their use cases.
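To spot-check the enforcement, you can try creating a workload that violates the baseline profile in a non-exempt namespace. The following manifest is a hypothetical test pod (not part of the original configuration); because baseline disallows host namespaces, applying it should be rejected by the PodSecurity admission controller:

```yaml
# Hypothetical test pod: the baseline profile forbids hostNetwork,
# so applying this in a non-exempt namespace should fail with a
# pod security violation once enforcement is active.
apiVersion: v1
kind: Pod
metadata:
  name: pss-baseline-test
spec:
  hostNetwork: true
  containers:
    - name: pause
      image: registry.k8s.io/pause:3.9
```

If enforcement is working, the API server rejects the request and reports which baseline check failed.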

Security considerations

note

For future integration of Pod Security Admission (PSA) configuration natively in Harvester, please verify the progress of issue #8196.

After a default PSS is applied, end users with permission to create and edit namespaces may still be able to override the policy by labeling their namespaces to allow privileged workloads, for example:

kubectl label --overwrite ns <namespace> pod-security.kubernetes.io/enforce=privileged

To avoid this, we recommend creating custom RBAC rules that restrict who can create or update namespaces, or additionally deploying a ValidatingAdmissionPolicy. The following policy blocks namespace create/update requests that contain the pod-security.kubernetes.io/enforce label, thereby preventing namespace admins from changing the setting for their namespace.

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: namespace-pss-label-rejection
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["namespaces"]
  validations:
    - expression: |
        !("pod-security.kubernetes.io/enforce" in object.metadata.labels)
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: namespace-pss-label-rejection-binding
spec:
  policyName: namespace-pss-label-rejection
  validationActions: [Deny]

If more tailored policies are needed, users can rely on security policy engines like Kubewarden's PSA Label Enforcer policy, or a similar solution, to ensure that namespaces have the required PSS configuration for deployment in the cluster.

· 3 min read
Ivan Sim
important

CVE-2025-1974 (vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H) has a score of 9.8 (Critical).

The vulnerability affects specific versions of the RKE2 ingress-nginx controller (v1.11.4 and earlier, and v1.12.0). All Harvester versions that use these controller versions (including v1.4.2 and earlier) are therefore affected.

This CVE is fixed in Harvester v1.5.0, v1.4.3, and later versions.

A security issue was discovered in Kubernetes where under certain conditions, an unauthenticated attacker with access to the pod network can achieve arbitrary code execution in the context of the ingress-nginx controller. This can lead to disclosure of secrets accessible to the controller. (Note that in the default installation, the controller can access all secrets cluster-wide.)

You can confirm the version of the RKE2 ingress-nginx pods by running this command on your Harvester cluster:

kubectl -n kube-system get po -l"app.kubernetes.io/name=rke2-ingress-nginx" -ojsonpath='{.items[].spec.containers[].image}'

If the command returns one of the affected versions, disable the rke2-ingress-nginx-admission validating webhook configuration by performing the following steps:

  1. On one of your control plane nodes, use kubectl to confirm the existence of the HelmChartConfig resource named rke2-ingress-nginx:

    $ kubectl -n kube-system get helmchartconfig rke2-ingress-nginx
    NAME                 AGE
    rke2-ingress-nginx   14d1h
  2. Use kubectl -n kube-system edit helmchartconfig rke2-ingress-nginx to add the following configurations to the resource:

    • .spec.valuesContent.controller.admissionWebhooks.enabled: false
    • .spec.valuesContent.controller.extraArgs.enable-annotation-validation: true
  3. The following example shows the updated .spec.valuesContent configuration alongside the default Harvester ingress-nginx configuration:

    apiVersion: helm.cattle.io/v1
    kind: HelmChartConfig
    metadata:
      name: rke2-ingress-nginx
      namespace: kube-system
    spec:
      valuesContent: |-
        controller:
          admissionWebhooks:
            port: 8444
            enabled: false
          extraArgs:
            enable-annotation-validation: true
            default-ssl-certificate: cattle-system/tls-rancher-internal
          config:
            proxy-body-size: "0"
            proxy-request-buffering: "off"
          publishService:
            pathOverride: kube-system/ingress-expose

    Save the file and exit the editor to apply the configuration.

    Harvester automatically applies the change once the content is saved.

    important

    The configuration disables the RKE2 ingress-nginx admission webhooks while preserving Harvester's default ingress-nginx configuration.

    If the HelmChartConfig resource contains other custom ingress-nginx configuration, you must retain them when editing the resource.

  4. Verify that RKE2 deleted the rke2-ingress-nginx-admission validating webhook configuration.

    $ kubectl get validatingwebhookconfiguration rke2-ingress-nginx-admission
    Error from server (NotFound): validatingwebhookconfigurations.admissionregistration.k8s.io "rke2-ingress-nginx-admission" not found
  5. Verify that the ingress-nginx pods are restarted successfully.

    $ kubectl -n kube-system get po -lapp.kubernetes.io/instance=rke2-ingress-nginx
    NAME                                  READY   STATUS    RESTARTS   AGE
    rke2-ingress-nginx-controller-g8l49   1/1     Running   0          5s

Once your Harvester cluster receives the RKE2 ingress-nginx patch, you can re-install the rke2-ingress-nginx-admission validating webhook configuration by removing the HelmChartConfig patch.

important

These steps only cover the RKE2 ingress-nginx controller that is managed by Harvester. You must also update other running ingress-nginx controllers. See the References section for more information.

References

· One min read
Tim Serong

The ISO image may fail to boot when you attempt to install Harvester on a host with the following characteristics:

  • An operating system was previously installed, particularly openSUSE Leap 15.5 or later and Harvester v1.3.1 or later. Other Linux distributions and recent versions of Windows may also be affected.
  • UEFI secure boot is enabled.

This issue occurs when the Harvester ISO uses a shim bootloader that is older than the bootloader previously installed on the host. For example, the Harvester v1.3.1 ISO uses shim 15.4 but the system uses shim 15.8 after installation, which sets SBAT revocations for older shims. Subsequent attempts to boot the older shim on the ISO fail with the following error:

Verifying shim SBAT data failed: Security Policy Violation
Something has gone seriously wrong: SBAT self-check failed: Security Policy Violation

To mitigate the issue, perform the following workaround:

  1. Disable Secure Boot.
  2. Boot the ISO image and proceed with the installation.
  3. Enable Secure Boot and boot into the installed system.


· 2 min read
Cooper Tseng

Harvester's embedded Rancher UI may display warnings about expiring KubeVirt certificates. You can safely ignore these warnings because automatic certificate rotation is handled by KubeVirt and is enabled by default.

kubevirt-certs-expired

KubeVirt Certificate Rotation Strategy

KubeVirt provides a self-signed certificate mechanism that rotates both the CA and certificates on a defined recurring interval. You can check the certificateRotateStrategy setting by running the following command:

kubectl get kubevirt -n harvester-system -o yaml

By default, the value of certificateRotateStrategy is empty, which means that KubeVirt uses its default rotation settings and no manual configuration is required.

certificateRotateStrategy: {}

Configuration Fields

You can use the following fields to configure certificateRotateStrategy.

  • .ca.duration: Validity period of the CA certificate. The default value is "168h".
  • .ca.renewBefore: Amount of time before a CA certificate expires during which a new certificate is issued. The default value is "33.6h".
  • .server.duration: Validity period of server component certificates (for example, virt-api, virt-handler, and virt-operator). The default value is "24h".
  • .server.renewBefore: Amount of time before a server certificate expires during which a new certificate is issued. The default value is "4.8h".

Example of a complete configuration:

certificateRotateStrategy:
  selfSigned:
    ca:
      duration: 168h
      renewBefore: 33.6h
    server:
      duration: 24h
      renewBefore: 4.8h
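As an illustration of how duration and renewBefore interact (a sketch, not KubeVirt code), the point at which proactive renewal kicks in can be computed as issuance time plus duration minus renewBefore. Note that both default renewBefore values are 20% of their corresponding duration (4.8h = 0.2 × 24h, 33.6h = 0.2 × 168h):

```python
from datetime import datetime, timedelta

def rotation_time(issued_at: datetime, duration: timedelta,
                  renew_before: timedelta) -> datetime:
    """A certificate is reissued renew_before ahead of its expiry
    (issued_at + duration), per the renewBefore semantics above."""
    return issued_at + duration - renew_before

# Server certificates with the default duration=24h, renewBefore=4.8h
# are rotated 19.2h after issuance.
issued = datetime(2024, 12, 6, 8, 0, 0)
rotate_at = rotation_time(issued, timedelta(hours=24), timedelta(hours=4.8))
print(rotate_at)  # 2024-12-07 03:12:00
```

With the default CA settings (duration=168h, renewBefore=33.6h), the same calculation yields a CA rotation 134.4h after issuance.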

Certificate Rotation Triggers

Certificate rotation can be triggered by several conditions. The following list only outlines key triggers and is not exhaustive.

  • Missing certificate: A required certificate does not exist.
  • Invalid CA signature: A certificate was not signed by the specified CA.
  • Proactive renewal: The renewBefore value takes effect. A new certificate must be issued before the current one expires.
  • CA expiration: The CA certificate has expired, so the certificate signed by the CA is also rotated.

When certificate rotation is triggered, you should see virt-operator log records similar to the following:

{"component":"virt-operator","level":"info","msg":"secret kubevirt-virt-api-certs updated","pos":"core.go:278","timestamp":"2024-12-06T08:02:01.045809Z"}
{"component":"virt-operator","level":"info","msg":"secret kubevirt-controller-certs updated","pos":"core.go:278","timestamp":"2024-12-06T08:02:01.056759Z"}
{"component":"virt-operator","level":"info","msg":"secret kubevirt-exportproxy-certs updated","pos":"core.go:278","timestamp":"2024-12-06T08:02:01.063530Z"}
{"component":"virt-operator","level":"info","msg":"secret kubevirt-virt-handler-server-certs updated","pos":"core.go:278","timestamp":"2024-12-06T08:02:01.068608Z"}
{"component":"virt-operator","level":"info","msg":"secret kubevirt-virt-handler-certs updated","pos":"core.go:278","timestamp":"2024-12-06T08:02:01.074555Z"}
{"component":"virt-operator","level":"info","msg":"secret kubevirt-operator-certs updated","pos":"core.go:278","timestamp":"2024-12-06T08:02:01.078719Z"}
{"component":"virt-operator","level":"info","msg":"secret kubevirt-export-ca updated","pos":"core.go:278","timestamp":"2024-12-06T08:03:36.063496Z"}
{"component":"virt-operator","level":"info","msg":"secret kubevirt-ca updated","pos":"core.go:278","timestamp":"2024-12-06T08:04:06.052750Z"}

References