Quick Start¶
Get your first surgical prediction running in 5 minutes.
1. Install¶
git clone https://github.com/dreamlessx/LandmarkDiff-public.git
cd LandmarkDiff-public
pip install -e .
This installs the core package with all inference dependencies (MediaPipe, PyTorch, diffusers, transformers).
For GPU acceleration (xformers + triton):
Verify the installation:
python -c "
import landmarkdiff
from landmarkdiff.landmarks import extract_landmarks
from landmarkdiff.manipulation import apply_procedure_preset
print(f'LandmarkDiff {landmarkdiff.__version__} installed successfully')
"
2. Prepare an Input Image¶
Use any front-facing photo. For best results: - Clear, well-lit face - Frontal pose (slight angles are fine) - At least 256x256 resolution (images are resized to 512x512 internally) - No sunglasses or face masks
3. Run Your First Prediction (TPS mode, no GPU)¶
TPS mode uses thin-plate spline warping and runs entirely on CPU. It is the fastest way to see LandmarkDiff in action:
This will:
1. Detect 478 facial landmarks using MediaPipe
2. Deform nose landmarks according to the rhinoplasty preset
3. Apply a thin-plate spline warp to the face image
4. Save the output to output/
The result appears in output/photo_rhinoplasty/before_after.png as a side-by-side comparison.
Python API¶
The same thing in three lines of Python:
from landmarkdiff.inference import LandmarkDiffPipeline
pipeline = LandmarkDiffPipeline(mode="tps", device="cpu")
result = pipeline.generate("photo.jpg", procedure="rhinoplasty", intensity=60)
The result dictionary contains:
- result["output"]: the final composited image
- result["input"]: the original 512x512 input
- result["conditioning"]: the deformed wireframe mesh overlay
- result["mask"]: the surgical region mask
4. GPU Mode (Photorealistic Results)¶
If you have a GPU with 6+ GB VRAM, use ControlNet mode for diffusion-based photorealistic output:
Or in Python:
pipeline = LandmarkDiffPipeline(mode="controlnet", device="cuda")
result = pipeline.generate("photo.jpg", procedure="rhinoplasty", intensity=60)
The first run downloads the SD1.5 and ControlNet weights (~5 GB). Subsequent runs reuse the cached models.
5. Try Different Procedures¶
LandmarkDiff supports 6 procedures:
# Nose reshaping
python scripts/run_inference.py photo.jpg --procedure rhinoplasty --intensity 50
# Eyelid surgery
python scripts/run_inference.py photo.jpg --procedure blepharoplasty --intensity 50
# Facelift
python scripts/run_inference.py photo.jpg --procedure rhytidectomy --intensity 60
# Jaw surgery
python scripts/run_inference.py photo.jpg --procedure orthognathic --intensity 50
# Brow lift
python scripts/run_inference.py photo.jpg --procedure brow_lift --intensity 45
# Chin reshaping
python scripts/run_inference.py photo.jpg --procedure mentoplasty --intensity 50
6. Adjust Intensity¶
The --intensity flag controls how dramatic the change is (0 = no change, 100 = maximum surgical effect):
# Subtle
python scripts/run_inference.py photo.jpg --procedure rhinoplasty --intensity 25
# Moderate
python scripts/run_inference.py photo.jpg --procedure rhinoplasty --intensity 50
# Aggressive
python scripts/run_inference.py photo.jpg --procedure rhinoplasty --intensity 85
A good starting point for most procedures is 40-60.
7. Launch the Interactive Demo¶
The Gradio web demo provides a visual interface with drag-and-drop upload and real-time sliders:
Open http://localhost:7860 in your browser. The demo provides: - Drag-and-drop image upload - Procedure selection dropdown - Intensity slider - Side-by-side before/after comparison
A hosted demo is also available at huggingface.co/spaces/dreamlessx/LandmarkDiff (TPS mode, CPU).
8. Combine Procedures¶
Apply multiple procedures sequentially for comprehensive facial planning:
from landmarkdiff.inference import LandmarkDiffPipeline
pipeline = LandmarkDiffPipeline(mode="tps", device="cpu")
# First pass: rhinoplasty
result = pipeline.generate("photo.jpg", procedure="rhinoplasty", intensity=50)
# Save intermediate, then apply chin work
from PIL import Image
import numpy as np
Image.fromarray(result["output"]).save("step1.jpg")
result2 = pipeline.generate("step1.jpg", procedure="mentoplasty", intensity=40)
Or at the landmark level for more precise control:
from landmarkdiff.landmarks import extract_landmarks
from landmarkdiff.manipulation import apply_procedure_preset
face = extract_landmarks(image)
step1 = apply_procedure_preset(face, "rhinoplasty", intensity=50)
step2 = apply_procedure_preset(step1, "mentoplasty", intensity=40)
Next Steps¶
- Custom Procedures: define your own surgical procedure preset
- Training: train on your own data
- Evaluation: evaluate prediction quality with metrics
- Deployment: deploy as an API or Docker service
- API Reference: full module documentation
- Procedure Docs: detailed anatomy and displacement info for each procedure
- Interactive Notebook: step-by-step Jupyter notebook with visualizations