Analytical benchmarking for Vision Transformer (ViT) pruning.
Estimate FLOPs, visualize token reduction, and optimize transmission without training.
PRUNEVISION exposes analytical algorithms to determine which "tokens" (parts of an image or embedding) are important.
static: Uses the L2 norm (magnitude) of each token. Keeps the tokens with the highest activation energy.
entropy: Measures information density within each token vector via a softmax/log distribution.
fractal: Computes the box-counting dimension (complexity) of the signal structure.
neighborhood: Evaluates each token's centrality and local variance relative to other tokens.
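The L2-norm and softmax-entropy scoring ideas described above can be sketched in a few lines of NumPy. This is an illustration only, not PRUNEVISION's implementation: the function names (`l2_scores`, `entropy_scores`, `prune_tokens`) are hypothetical, and rounding the number of dropped tokens down is an assumption.

```python
import numpy as np

def l2_scores(tokens: np.ndarray) -> np.ndarray:
    """Score each token by the L2 norm of its embedding. tokens has shape (N, D)."""
    return np.linalg.norm(tokens, axis=-1)

def entropy_scores(tokens: np.ndarray) -> np.ndarray:
    """Score each token by the Shannon entropy of a softmax over its features."""
    x = np.asarray(tokens, dtype=np.float64)
    x = x - x.max(axis=-1, keepdims=True)        # shift for numerical stability
    probs = np.exp(x)
    probs /= probs.sum(axis=-1, keepdims=True)
    return -(probs * np.log(probs + 1e-12)).sum(axis=-1)

def prune_tokens(tokens: np.ndarray, scores: np.ndarray, prune_ratio: float):
    """Keep the highest-scoring tokens, preserving their original order."""
    n = tokens.shape[0]
    keep = n - int(n * prune_ratio)              # assumption: dropped count rounds down
    kept_idx = np.sort(np.argsort(scores)[::-1][:keep])
    return tokens[kept_idx], kept_idx

rng = np.random.default_rng(0)
demo_tokens = rng.standard_normal((196, 768)).astype(np.float32)
kept, kept_idx = prune_tokens(demo_tokens, l2_scores(demo_tokens), prune_ratio=0.5)
```

Swapping `l2_scores` for `entropy_scores` changes which tokens survive but not how many, since the count depends only on `prune_ratio`.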
Since we are just starting out, your feedback is the most important tool for PRUNEVISION to evolve.
Process raw embedding tensors efficiently. Preferred for backend-to-backend communication.
High-performance endpoint accepting raw bytes (numpy/torch buffer). Avoids JSON parsing overhead.
| Name | Type | Description |
|---|---|---|
| method | string | entropy (default), static, fractal, neighborhood |
| prune_ratio | float | 0.0 to 1.0 (e.g., 0.5 removes 50% of tokens) |
| shape | string | Input shape "B,N,D" (e.g., "1,196,768") |
| return_binary | bool | If true, returns raw bytes of pruned tensor. |
Multipart/Form-Data: File upload containing raw float32 bytes.
Rate Limit: 20/min
import requests
import numpy as np

# Generate dummy data: Batch=1, Tokens=196, Dim=768
data = np.random.rand(1, 196, 768).astype(np.float32)

# Send the raw float32 buffer as a multipart file upload
response = requests.post(
    "http://localhost:8000/prune/embeddings-binary",
    params={"shape": "1,196,768", "method": "fractal", "prune_ratio": 0.3},
    files={"file": data.tobytes()},
)
print(response.json())
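When `return_binary` is true the response body is raw bytes rather than JSON. The exact layout of those bytes is not documented here, so the following is a hedged sketch that assumes a row-major float32 buffer of shape (B, N_kept, D). The helper name `decode_pruned` is hypothetical.

```python
import numpy as np

def decode_pruned(raw: bytes, batch: int, dim: int) -> np.ndarray:
    """Rebuild a (batch, n_kept, dim) float32 array from a raw byte buffer.

    Assumes row-major float32 data; the token count is inferred from the length.
    """
    flat = np.frombuffer(raw, dtype=np.float32)
    if flat.size % (batch * dim) != 0:
        raise ValueError("buffer length does not match the given batch and dim")
    return flat.reshape(batch, -1, dim)

# Offline round trip standing in for a server response body
original = np.arange(1 * 4 * 3, dtype=np.float32).reshape(1, 4, 3)
decoded = decode_pruned(original.tobytes(), batch=1, dim=3)
```

With `requests`, the bytes to pass in would be `response.content`.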
Directly upload images to analyze pruning behavior on visual patches.
Returns a PNG image showing which parts of the image were kept (color) and removed (black).
| Name | Description |
|---|---|
| file | Multipart image file (JPEG/PNG) |
| method | Pruning strategy to apply. |
| prune_ratio | Fraction of patches to mask out (e.g., 0.7). |
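The kept/removed visualization can be approximated locally by blacking out the pruned patches. The sketch below assumes a 224x224 image split into 16x16 patches, which gives the 14x14 = 196 tokens used elsewhere in this document. The patch size and the function name `mask_patches` are assumptions, not part of the API.

```python
import numpy as np

def mask_patches(image: np.ndarray, keep_mask: np.ndarray, patch: int = 16) -> np.ndarray:
    """Black out removed patches.

    image:     (H, W, C) array
    keep_mask: (H // patch, W // patch) boolean grid, True for kept patches
    """
    # Expand each grid cell to a patch x patch block of pixels
    pixel_mask = keep_mask.repeat(patch, axis=0).repeat(patch, axis=1)
    out = image.copy()
    out[~pixel_mask] = 0
    return out

# Demo: keep the top half of a white image, remove the bottom half
image = np.full((224, 224, 3), 255, dtype=np.uint8)
keep_mask = np.zeros((14, 14), dtype=bool)
keep_mask[:7] = True
masked = mask_patches(image, keep_mask)
```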
Converts image to tokens internally, prunes, and returns FLOPs/Reduction metrics.
{
  "filename": "cat.jpg",
  "method": "entropy",
  "metrics": {
    "original_tokens": 196,
    "remaining_tokens": 98,
    "token_reduction_ratio": 0.5,
    "flops_reduction_ratio": 0.5012
  }
}
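The FLOPs reduction is slightly larger than the token reduction because self-attention cost grows quadratically with the token count, while the projections and MLP grow linearly. The sketch below uses a common textbook approximation for one transformer block. It is an assumption about the general shape of such an estimate, not the formula PRUNEVISION uses, so its result need not match the sample value above. The function names are hypothetical.

```python
def vit_block_macs(n_tokens: int, dim: int, mlp_ratio: int = 4) -> int:
    """Approximate multiply-accumulates for one ViT block."""
    projections = 4 * n_tokens * dim * dim       # Q, K, V and output projections
    attention = 2 * n_tokens * n_tokens * dim    # QK^T and attention-weighted sum of V
    mlp = 2 * mlp_ratio * n_tokens * dim * dim   # two linear layers in the MLP
    return projections + attention + mlp

def flops_reduction(original_tokens: int, remaining_tokens: int, dim: int = 768) -> float:
    """Fraction of per-block compute saved, assuming pruning applies to the whole block."""
    before = vit_block_macs(original_tokens, dim)
    after = vit_block_macs(remaining_tokens, dim)
    return 1.0 - after / before

token_reduction = 1.0 - 98 / 196
reduction = flops_reduction(196, 98)
```

Under these assumptions, `reduction` comes out a little above `token_reduction`, and the gap widens as the token count grows relative to the embedding dimension.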
/benchmark/compare-all
Upload a binary tensor once. The server runs all 4 algorithms and returns a JSON comparison list sorted by signal preservation score.
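The same kind of comparison can be mocked up offline. The document does not define the signal preservation score, so this sketch substitutes one plausible stand-in: the fraction of squared L2 energy retained by the kept tokens. The scorers, field names, and `compare` helper are all hypothetical.

```python
import numpy as np

def preserved_energy(tokens: np.ndarray, kept_idx: np.ndarray) -> float:
    """Fraction of total squared L2 energy retained by the kept tokens."""
    total = float((tokens ** 2).sum())
    return float((tokens[kept_idx] ** 2).sum()) / total

def compare(tokens: np.ndarray, scorers: dict, prune_ratio: float) -> list:
    """Run each scorer, prune, and return results sorted best-first."""
    n = tokens.shape[0]
    keep = n - int(n * prune_ratio)
    results = []
    for name, scorer in scorers.items():
        kept_idx = np.argsort(scorer(tokens))[::-1][:keep]
        results.append({
            "method": name,
            "signal_preservation": preserved_energy(tokens, kept_idx),
        })
    return sorted(results, key=lambda r: r["signal_preservation"], reverse=True)

rng = np.random.default_rng(0)
sample = rng.standard_normal((196, 64))
scorers = {
    "l2_norm": lambda t: np.linalg.norm(t, axis=-1),
    "feature_variance": lambda t: t.var(axis=-1),
}
ranking = compare(sample, scorers, prune_ratio=0.5)
```

With this particular stand-in metric, L2-norm pruning always ranks first, because keeping the highest-norm tokens maximizes retained energy by construction. A different metric could order the methods differently.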
Ideal for research papers.
/optimize/transmission
Compresses an image into a custom .spv (Sparse Vision) format containing only high-entropy patches.
The response includes an X-Savings-Percent header.