Quantum Kernel SVM — Classification in Hilbert Space
What You'll Learn:
- How quantum kernels K(x,y) = |⟨φ(x)|φ(y)⟩|² enable SVMs to classify data in exponentially large Hilbert spaces
- Why the ZZ feature map creates entanglement between encoded features, capturing non-linear correlations
- How the SVM dual problem uses quantum kernel matrices to find maximum-margin decision boundaries
- When quantum kernels provably outperform classical kernels (and when they don't)
Level: Advanced | Time: 30 minutes | Qubits: 4 | Framework: PennyLane
Prerequisites
- Quantum Kernel — kernel trick, feature maps, kernel matrices
- Fidelity Kernel — inversion test, SWAP test, kernel evaluation
- Bell State — entanglement, measurement basics
The Idea
Classical SVMs find the hyperplane that best separates two classes of data. The kernel trick lets them work in high-dimensional feature spaces without explicitly computing the transformation. A quantum kernel replaces the classical kernel with a quantum circuit: encode data into a quantum state, then measure the overlap between states.
Why quantum? A quantum state on n qubits lives in a 2^n-dimensional Hilbert space. The ZZ feature map encodes pairwise feature products into entangling phases, creating correlations that are hard to represent classically. For certain carefully constructed data distributions, this yields a provable exponential advantage over any efficient classical learner.
Think of it this way: a classical kernel is like projecting data onto a fixed set of axes. A quantum kernel projects data onto an exponentially larger set of axes defined by quantum interference — some of which have no efficient classical description.
How It Works
Step 1: Encode Data (ZZ Feature Map)
Each data point x is encoded into a quantum state |φ(x)⟩ via the ZZ feature map:
```
     ┌───┐┌─────────┐                        ┌───┐┌─────────┐
q_0: ┤ H ├┤ RZ(x₀π) ├──■─────────────────■──┤ H ├┤ RZ(x₀π) ├──■── ...
     ├───┤├─────────┤┌─┴─┐┌───────────┐┌─┴─┐├───┤├─────────┤┌─┴─┐
q_1: ┤ H ├┤ RZ(x₁π) ├┤ X ├┤ RZ(x₀x₁π) ├┤ X ├┤ H ├┤ RZ(x₁π) ├┤ X ├ ...
     ├───┤├─────────┤└───┘└───────────┘└───┘├───┤├─────────┤└───┘
q_2: ┤ H ├┤ RZ(x₀π) ├──■─────────────────■──┤ H ├┤ RZ(x₀π) ├──■── ...
     ├───┤├─────────┤┌─┴─┐┌───────────┐┌─┴─┐├───┤├─────────┤┌─┴─┐
q_3: ┤ H ├┤ RZ(x₁π) ├┤ X ├┤ RZ(x₀x₁π) ├┤ X ├┤ H ├┤ RZ(x₁π) ├┤ X ├ ...
     └───┘└─────────┘└───┘└───────────┘└───┘└───┘└─────────┘└───┘
     ├─────────────── rep 1 ───────────────┤├────── rep 2 ──────
```
- H + RZ(x_i π): Encodes each feature as a phase rotation in superposition
- CNOT-RZ-CNOT: Implements the ZZ interaction exp(-i x_i x_j π Z_i Z_j / 2), encoding feature products into entangling phases
Step 2: Compute Kernel (Inversion Test)
The kernel K(x,y) = |⟨φ(x)|φ(y)⟩|² is computed via the inversion test:
|0...0⟩ ── U(x) ── U†(y) ── Measure P(|0...0⟩)
If x = y, the inverse perfectly undoes the encoding and P(|0...0⟩) = 1. If x ≠ y, the mismatch leaves residual amplitude in other basis states.
Step 3: Build Kernel Matrix
Evaluate K(x_i, x_j) for all pairs of training points:
```
    ┌                            ┐
K = │ K(x₁,x₁)   K(x₁,x₂)   …   │
    │ K(x₂,x₁)   K(x₂,x₂)   …   │
    │     ⋮          ⋮       ⋱   │
    └                            ┘
```
This requires O(n²) quantum circuit evaluations.
Step 4: Train Classical SVM
Feed the quantum kernel matrix to the SVM dual problem. The optimizer finds support vectors (data points closest to the decision boundary) and the maximum-margin classifier.
Step 5: Predict
For a new point x, compute K(x, x_i) against all training points and evaluate:
```
f(x) = Σ_i α_i y_i K(x, x_i) + b
prediction = sign(f(x))
```
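Steps 4 and 5 can be sketched with scikit-learn's precomputed-kernel mode. To keep the sketch fast and self-contained, a classical RBF Gram matrix stands in for the quantum one (an assumption; in the quantum pipeline K would come from the inversion test):

```python
import numpy as np
from sklearn.svm import SVC

# Toy separable data: two clusters, labels in {-1, +1}
rng = np.random.default_rng(1)
X_train = np.vstack([rng.normal(0.3, 0.05, (10, 2)),
                     rng.normal(0.7, 0.05, (10, 2))])
y_train = np.array([-1] * 10 + [+1] * 10)

def rbf(A, B, gamma=10.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# Step 4: train on the precomputed Gram matrix
K_train = rbf(X_train, X_train)
svm = SVC(kernel="precomputed", C=1.0).fit(K_train, y_train)

# Step 5: predicting a new point needs K(x_new, x_i) against all TRAINING points
x_new = np.array([[0.65, 0.72]])
K_new = rbf(x_new, X_train)
pred = svm.predict(K_new)

# Manual decision function f(x) = sum_i alpha_i y_i K(x, x_i) + b,
# using sklearn's dual_coef_ = alpha_i * y_i over the support vectors
f = K_new[:, svm.support_] @ svm.dual_coef_[0] + svm.intercept_[0]
```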
The Math
ZZ Feature Map
The feature map applies the unitary:
U(x) = ∏_{rep} [ U_ZZ(x) · U_φ(x) ]
where:
```
U_φ(x)  = ⊗_i [ RZ(x_i π) · H ]                    (single-qubit encoding)
U_ZZ(x) = ∏_{⟨i,j⟩} exp(-i x_i x_j π Z_i Z_j / 2)  (entangling)
```
The ZZ gate is decomposed as CNOT → RZ → CNOT:
exp(-iθZZ) = CNOT · (I ⊗ RZ(2θ)) · CNOT
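This identity is easy to check numerically; since Z⊗Z is diagonal, its exponential is just an entrywise exponential of the diagonal:

```python
import numpy as np

theta = 0.37  # arbitrary test angle

# Left side: exp(-i * theta * Z⊗Z), with Z⊗Z = diag(1, -1, -1, 1)
zz_diag = np.array([1, -1, -1, 1])
exp_zz = np.diag(np.exp(-1j * theta * zz_diag))

# Right side: CNOT · (I ⊗ RZ(2θ)) · CNOT
RZ = lambda phi: np.diag([np.exp(-1j * phi / 2), np.exp(1j * phi / 2)])
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)
decomposed = CNOT @ np.kron(np.eye(2), RZ(2 * theta)) @ CNOT

print(np.allclose(exp_zz, decomposed))  # the identity holds exactly
```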
Kernel as Fidelity
The quantum kernel is the fidelity between encoded states:
K(x,y) = |⟨0|U†(x)U(y)|0⟩|² = |⟨φ(x)|φ(y)⟩|²
This is a valid Mercer kernel: symmetric and positive semi-definite by construction.
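The PSD property follows because K is the Gram matrix of the projectors |φ(x_i)⟩⟨φ(x_i)| under the Hilbert-Schmidt inner product: K_ij = Tr(ρ_i ρ_j). A quick numpy check with random normalized states standing in for the feature map output:

```python
import numpy as np

# Random normalized "encoded states" stand in for |phi(x_i)> (illustrative)
rng = np.random.default_rng(0)
states = rng.normal(size=(6, 4)) + 1j * rng.normal(size=(6, 4))
states /= np.linalg.norm(states, axis=1, keepdims=True)

# K_ij = |<phi_i|phi_j>|^2: symmetric, unit diagonal, positive semi-definite
K = np.abs(states @ states.conj().T) ** 2
print(np.linalg.eigvalsh(K).min())  # non-negative up to float error
```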
SVM Dual Problem
Given kernel matrix K and labels y ∈ {-1, +1}:
```
max_α   Σ_i α_i − ½ Σ_{i,j} α_i α_j y_i y_j K(x_i, x_j)

subject to:   0 ≤ α_i ≤ C       (box constraint)
              Σ_i α_i y_i = 0   (class balance)
```
The solution identifies support vectors (α_i > 0) and the bias term b.
Quantum Advantage Condition
Quantum kernels offer advantage when:
- The target function lies in the RKHS of the quantum kernel (expressivity)
- The quantum kernel values are hard to estimate classically (hardness)
- Sufficient training data to learn the function (generalization)
Liu et al. (2021) proved a rigorous separation for specific data distributions.
Expected Output
| Metric | Expected Value |
|---|---|
| Accuracy | 80-95% (on separable synthetic data) |
| Support vectors | 3-8 (depends on margin) |
| K(x,x) | 1.0 (self-overlap) |
| K(x,y) range | [0, 1] |
| Kernel PSD | True |
| Circuit evaluations | O(n²) for n training points |
Running the Circuit
```python
from circuit import run_circuit, verify_quantum_kernel_svm

# Train and evaluate quantum SVM
result = run_circuit(n_samples=20, n_qubits=4)
print(f"Accuracy: {result['accuracy']:.2%}")
print(f"Support vectors: {result['n_support_vectors']}")

# Verification suite
v = verify_quantum_kernel_svm()
for check in v["checks"]:
    status = "PASS" if check["passed"] else "FAIL"
    print(f"[{status}] {check['name']}: {check['detail']}")
```
Try It Yourself
1. Vary the regularization: Try `C=0.1` (soft margin) vs `C=100` (hard margin) in `train_quantum_svm()`. How does the number of support vectors change?
2. Increase feature map depth: Set `reps=3` or `reps=4` in the feature map. Does the extra expressivity improve classification, or does it lead to overfitting (training accuracy >> test accuracy)?
3. Non-separable data: Modify `run_circuit` to use overlapping clusters (e.g., both centered at 0.5). How does the quantum SVM handle non-linearly-separable data compared to a classical RBF kernel?
4. Scale the qubits: Try `n_qubits=2` vs `n_qubits=6`. More qubits means a larger Hilbert space, but does accuracy always improve? What happens to runtime?
5. Compare with classical: Compute an RBF kernel `exp(-||x-y||²/2σ²)` on the same data with scikit-learn. Is the quantum kernel competitive on this synthetic dataset?
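For the classical-baseline exercise, a minimal scikit-learn sketch (the synthetic clusters here are a hypothetical stand-in for the tutorial's data):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two well-separated clusters in [0, 1]^2 (illustrative data, not circuit.py's)
rng = np.random.default_rng(7)
X = np.vstack([rng.normal(0.3, 0.1, (20, 2)), rng.normal(0.7, 0.1, (20, 2))])
y = np.array([-1] * 20 + [+1] * 20)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Classical RBF baseline to compare against the quantum kernel's accuracy
clf = SVC(kernel="rbf", gamma="scale").fit(X_tr, y_tr)
print(f"Classical RBF baseline accuracy: {clf.score(X_te, y_te):.2%}")
```

On easy synthetic data like this, expect the RBF baseline to be very competitive; quantum kernels only help when the data's structure matches the feature map.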
What's Next
- Fidelity Kernel — SWAP test approach to kernel evaluation (compare with inversion test)
- Trainable Kernel — Add variational parameters to the feature map
- Projected Quantum Kernel — Classical post-processing of quantum measurements
Applications
| Domain | Use case |
|---|---|
| Drug discovery | Molecular property classification using quantum-encoded molecular descriptors |
| Anomaly detection | One-class SVM with quantum kernel for detecting outliers in high-dimensional data |
| Financial modeling | Classification of market regimes using quantum feature correlations |
| Material science | Classifying quantum phases of matter using experimentally measurable kernels |
References
- Havlicek, V. et al. (2019). "Supervised learning with quantum-enhanced feature spaces." Nature 567, 209-212. DOI: 10.1038/s41586-019-0980-2
- Schuld, M. & Killoran, N. (2019). "Quantum Machine Learning in Feature Hilbert Spaces." Physical Review Letters 122, 040504. DOI: 10.1103/PhysRevLett.122.040504
- Liu, Y. et al. (2021). "A rigorous and robust quantum speed-up in supervised machine learning." Nature Physics 17, 1013-1017. DOI: 10.1038/s41567-021-01287-z