Variational Quantum Classifier
What You'll Learn:
- How to encode classical data into quantum states using rotation gates
- How variational layers create trainable quantum transformations
- How to train a quantum classifier with classical optimization
- The encode → process → measure paradigm of quantum machine learning
Level: Intermediate | Time: 25 minutes | Qubits: 2 | Framework: Qiskit
Prerequisites
- Bell State — CX entanglement basics
- H2 Ground State — variational optimization
The Idea
A neural network takes input data, passes it through trainable layers, and outputs a prediction. A variational quantum classifier does the same thing — on a quantum computer.
The key differences: (1) the "input layer" encodes data as qubit rotations instead of neuron activations, (2) the "hidden layers" are parameterized quantum gates instead of weight matrices, and (3) the "output" is a measurement probability instead of a softmax.
For binary classification (is this email spam or not?), we encode 2 features into 2 qubits, apply trainable layers, and measure qubit 0: if P(|1⟩) > 0.5, predict class 1.
Why quantum? For small problems like this, classical classifiers are faster. The potential advantage comes at scale: quantum feature spaces grow exponentially with qubits, potentially capturing patterns that classical models miss.
How It Works
The Architecture
```
     ┌──────────┐ ┌──────────┐      ┌──────────┐ ┌──────────┐      ┌──────────┐
q_0: ┤ RY(x₀·π)├─┤ RY(θ₀)  ├──■──┤ RZ(θ₂)  ├─┤ RY(θ₄)  ├──■──┤ RZ(θ₆)  ├─ M
     ├──────────┤ ├──────────┤┌─┴─┐├──────────┤ ├──────────┤┌─┴─┐├──────────┤
q_1: ┤ RY(x₁·π)├─┤ RY(θ₁)  ├┤ X ├┤ RZ(θ₃)  ├─┤ RY(θ₅)  ├┤ X ├┤ RZ(θ₇)  ├
     └──────────┘ └──────────┘└───┘└──────────┘ └──────────┘└───┘└──────────┘
     |─ Encode ─| |──────── Layer 1 ──────────| |──────── Layer 2 ──────────|
```
Step 1: Feature Encoding
Each feature x_i ∈ [-1, 1] is mapped to a rotation angle:
```python
qc.ry(x[0] * np.pi, 0)  # x=0 → |0⟩, x=1 → |1⟩, x=0.5 → superposition
qc.ry(x[1] * np.pi, 1)
```
This places the data point on the Bloch sphere — different inputs produce different quantum states.
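The encoding can be checked without any hardware. Below is a minimal NumPy sketch of the same map; the helper names `ry` and `encode` are illustrative, not part of the tutorial's `circuit` module:

```python
import numpy as np

def ry(theta):
    """RY rotation matrix: RY(θ)|0⟩ = cos(θ/2)|0⟩ + sin(θ/2)|1⟩."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def encode(x):
    """Map features x₀, x₁ ∈ [-1, 1] to the state RY(x₁π) ⊗ RY(x₀π)|00⟩."""
    ket0 = np.array([1.0, 0.0])
    return np.kron(ry(x[1] * np.pi) @ ket0, ry(x[0] * np.pi) @ ket0)

print(np.round(np.abs(encode([0.0, 0.0])) ** 2, 3))  # x = (0,0) stays in |00⟩
print(np.round(np.abs(encode([0.5, 0.5])) ** 2, 3))  # equal superposition over all four basis states
```

Different inputs land on different points of the Bloch sphere, which is all the "input layer" does.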
Step 2: Variational Layers
Each layer has 3 components:
- RY rotations: Mix amplitudes (like weights in a neural network)
- CX gate: Entangle qubits (without it, each qubit would process its feature independently)
- RZ rotations: Adjust phases (adds expressibility)
With 2 layers and 4 parameters each: 8 trainable parameters.
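One layer can be written out as a small matrix sketch in plain NumPy (no Qiskit needed). The CX matrix below assumes Qiskit's little-endian qubit ordering, and `layer` is an illustrative name:

```python
import numpy as np

def ry(t):
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def rz(t):
    return np.diag([np.exp(-1j * t / 2), np.exp(1j * t / 2)])

# CX with control qubit 0, target qubit 1 (little-endian basis order: 00, 01, 10, 11)
CX = np.array([[1, 0, 0, 0],
               [0, 0, 0, 1],
               [0, 0, 1, 0],
               [0, 1, 0, 0]], dtype=complex)

def layer(theta):
    """One variational layer: RY rotations, then CX, then RZ rotations."""
    rotations = np.kron(ry(theta[1]), ry(theta[0]))  # qubit 1 ⊗ qubit 0
    phases = np.kron(rz(theta[3]), rz(theta[2]))
    return phases @ CX @ rotations

U = layer([0.1, 0.2, 0.3, 0.4])
print(np.allclose(U.conj().T @ U, np.eye(4)))  # → True: the layer is unitary
```

With all four parameters at zero, the layer reduces to a bare CX, which is a useful sanity check.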
Step 3: Measurement
Measure qubit 0: P(|1⟩) = probability of class 1. If P(|1⟩) > 0.5, predict class 1.
```python
from circuit import predict, train_classifier

# Single prediction
result = predict([0.5, 0.5], theta=[0.1] * 8)
print(f"Class: {result['prediction']}, Confidence: {result['confidence']:.1%}")

# Train on dataset
trained = train_classifier(max_iterations=80, seed=42)
print(f"Accuracy: {trained['accuracy']:.1%}")
```
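Under the hood, reading P(|1⟩) for qubit 0 off a 2-qubit statevector is just a sum of two basis-state probabilities. A NumPy sketch (assuming Qiskit's little-endian bit ordering; `prob_class1` is an illustrative helper, not part of the `circuit` module):

```python
import numpy as np

def prob_class1(state):
    """P(qubit 0 = |1⟩): total weight on basis states |01⟩ and |11⟩ (little-endian)."""
    probs = np.abs(np.asarray(state)) ** 2
    return probs[1] + probs[3]

state = np.array([0.6, 0.8, 0.0, 0.0])  # 0.6|00⟩ + 0.8|01⟩
p1 = prob_class1(state)
print(round(p1, 2), "→ class", int(p1 > 0.5))  # 0.64 → class 1
```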
The Math
Encoding Layer
The feature map F(x) prepares state:
|ψ(x)⟩ = RY(x₁π) ⊗ RY(x₀π) |00⟩
Each RY gate acts on |0⟩ as RY(θ)|0⟩ = cos(θ/2)|0⟩ + sin(θ/2)|1⟩, so each feature is encoded as a rotation angle on its qubit's Bloch sphere.
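As a concrete instance of this formula, the point x = (0.5, 0) encodes as:

```latex
\mathrm{RY}(0.5\pi)\,|0\rangle
  = \cos\tfrac{\pi}{4}\,|0\rangle + \sin\tfrac{\pi}{4}\,|1\rangle
  = \tfrac{1}{\sqrt{2}}\bigl(|0\rangle + |1\rangle\bigr),
\qquad
|\psi(x)\rangle
  = \mathrm{RY}(0) \otimes \mathrm{RY}(0.5\pi)\,|00\rangle
  = \tfrac{1}{\sqrt{2}}\bigl(|00\rangle + |01\rangle\bigr).
```

Qubit 0 (the rightmost bit) ends up in an equal superposition while qubit 1 stays in |0⟩.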
Variational Layer
Each layer applies:
V(θ) = (RZ(θ₃) ⊗ RZ(θ₂)) · CX · (RY(θ₁) ⊗ RY(θ₀))
The full classifier maps input x to class probability:
P(class=1 | x, θ) = ⟨ψ(x, θ)| Π₁ |ψ(x, θ)⟩,  where |ψ(x, θ)⟩ = V_L(θ_L) ··· V_1(θ_1) F(x) |00⟩
and Π₁ = I ⊗ |1⟩⟨1| is the projector onto qubit 0 being |1⟩ (the outcome on qubit 1 is summed over).
Training
Minimize the loss over dataset {(x_i, y_i)}:
L(θ) = 1 - (1/N) Σᵢ [yᵢ = ŷᵢ(θ)]
where ŷᵢ(θ) is the prediction for input xᵢ and [·] is the Iverson bracket (1 when the prediction matches the label, 0 otherwise), so minimizing L(θ) maximizes training accuracy. COBYLA is used because this loss is non-differentiable and COBYLA needs no gradients.
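The whole encode → process → measure → optimize loop fits in a short self-contained sketch: a plain NumPy statevector simulation of the circuit above, trained with SciPy's COBYLA on a toy linearly separable dataset. Everything here (the dataset, `classify`, `loss`) is illustrative, not the tutorial's `circuit` module:

```python
import numpy as np
from scipy.optimize import minimize

def ry(t):
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def rz(t):
    return np.diag([np.exp(-1j * t / 2), np.exp(1j * t / 2)])

# CX, control qubit 0 → target qubit 1 (little-endian basis order: 00, 01, 10, 11)
CX = np.array([[1, 0, 0, 0],
               [0, 0, 0, 1],
               [0, 0, 1, 0],
               [0, 1, 0, 0]], dtype=complex)

def classify(x, theta):
    """Encode → two variational layers → P(qubit 0 = |1⟩)."""
    state = np.kron(ry(x[1] * np.pi), ry(x[0] * np.pi)) @ np.array([1, 0, 0, 0], dtype=complex)
    for k in (0, 4):                                   # two layers, 4 parameters each
        state = np.kron(ry(theta[k + 1]), ry(theta[k])) @ state
        state = CX @ state
        state = np.kron(rz(theta[k + 3]), rz(theta[k + 2])) @ state
    probs = np.abs(state) ** 2
    return probs[1] + probs[3]                         # weight on states where qubit 0 is |1⟩

def loss(theta, X, y):
    preds = np.array([int(classify(x, theta) > 0.5) for x in X])
    return 1.0 - np.mean(preds == y)

# Toy dataset: the sign of the features decides the class
X = np.array([[0.8, 0.8], [0.6, 0.9], [0.9, 0.5],
              [-0.8, -0.8], [-0.6, -0.9], [-0.9, -0.5]])
y = np.array([1, 1, 1, 0, 0, 0])

rng = np.random.default_rng(42)
res = minimize(loss, rng.uniform(0, np.pi, 8), args=(X, y),
               method="COBYLA", options={"maxiter": 80})
print(f"training accuracy: {1.0 - res.fun:.0%}")
```

Note that because the loss is piecewise constant, runs with different seeds can converge to different accuracies, which matches the spread in the table below.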
Expected Output
| Metric | Value |
|---|---|
| Random baseline accuracy | ~50% |
| Trained accuracy (6 samples) | >66% (typically 83-100%) |
| Parameters | 8 (2 layers × 4) |
| Training iterations | ~50-80 |
Running the Circuit
```python
from circuit import run_circuit, train_classifier, verify_classifier, evaluate_dataset

# Single prediction
result = run_circuit(x=[0.8, -0.2])
print(f"Prediction: class {result['prediction']}")

# Train
trained = train_classifier(max_iterations=80, shots=512, seed=42)
print(f"Accuracy: {trained['accuracy']:.1%}")

# Verify
v = verify_classifier()
for check in v["checks"]:
    print(f"[{'PASS' if check['passed'] else 'FAIL'}] {check['name']}")
```
Try It Yourself
- Add more layers: Train with `n_layers=3` (12 parameters). Does accuracy improve? Is training slower?
- Try XOR data: Use `X=[[1,1],[-1,-1],[1,-1],[-1,1]]`, `y=[0,0,1,1]`. Can the classifier learn XOR? (Hint: it needs entanglement.)
- More data: Generate 20 random samples from two Gaussian clusters. Does the classifier generalize?
- Remove entanglement: Replace CX with identity. Train again. How much accuracy is lost?
- Feature map comparison: Replace RY encoding with RZ encoding. Does the classifier still learn?
What's Next
- Quantum Kernel — Kernel methods instead of variational classification
- Data Re-uploading — Universal classification on a single qubit
- Amplitude Encoding — Exponential data compression
Applications
| Domain | Use case |
|---|---|
| Drug discovery | Molecular property classification |
| Finance | Credit risk assessment, fraud detection |
| Materials | Phase classification in quantum materials |
| HEP | Particle identification in detector data |
References
- Schuld, M. et al. (2020). "Circuit-centric quantum classifiers." Physical Review A 101, 032308. DOI: 10.1103/PhysRevA.101.032308
- Havlicek, V. et al. (2019). "Supervised learning with quantum-enhanced feature spaces." Nature 567, 209-212. DOI: 10.1038/s41586-019-0980-2
- Perez-Salinas, A. et al. (2020). "Data re-uploading for a universal quantum classifier." Quantum 4, 226. DOI: 10.22331/q-2020-02-06-226