Microsoft TRELLIS Tutorial: A Guide to 3D Generation from Images and Text
Introduction to TRELLIS
TRELLIS is a large-scale 3D asset generation model open-sourced by Microsoft that supports high-quality 3D content generation from text or images. It employs a structured 3D latent space approach to achieve scalable and versatile 3D generation.
Key Features:
- Supports both image-to-3D and text-to-3D generation modes
- Uses a structured 3D latent space for higher generation quality
- Outputs multiple 3D representations (3D Gaussians, radiance fields, and meshes)
- Open source and easy to deploy
- Supports export to standard 3D file formats such as GLB and PLY
Quick Start
System Requirements
- NVIDIA GPU with CUDA support and at least 16 GB of VRAM (RTX 30/40 series recommended)
- CUDA Toolkit 11.8 or 12.2
- Python 3.8+
- conda package manager
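Before installing, it can help to confirm that the tools listed above are actually reachable. The following is a minimal sketch, not part of TRELLIS: the `preflight` function is a hypothetical helper, and it assumes that `nvidia-smi`, `nvcc`, and `conda` are on your `PATH` when the driver, CUDA Toolkit, and conda are installed (which is common but not guaranteed).

```python
import shutil
import sys


def preflight(version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # The guide asks for Python 3.8 or newer.
    if tuple(version_info[:2]) < (3, 8):
        problems.append("Python 3.8+ is required")
    # nvidia-smi ships with the NVIDIA driver; nvcc ships with the CUDA Toolkit.
    for tool, hint in (("nvidia-smi", "NVIDIA driver"),
                       ("nvcc", "CUDA Toolkit"),
                       ("conda", "conda package manager")):
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH ({hint})")
    return problems


if __name__ == "__main__":
    issues = preflight()
    print("\n".join(issues) if issues else "Environment looks ready.")
```

A missing `nvcc` does not always mean the toolkit is absent; it may simply not be on `PATH`, so treat the report as a hint rather than a verdict.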
Installation Steps
- Clone the repository:
git clone --recurse-submodules https://github.com/microsoft/TRELLIS.git
cd TRELLIS
- Create and activate conda environment:
# Using CUDA 11.8
. ./setup.sh --new-env --basic --xformers --flash-attn --diffoctreerast --spconv --mipgaussian --kaolin --nvdiffrast
# For CUDA 12.2, manually install dependencies
conda create -n trellis python=3.10
conda activate trellis
pip install -r requirements.txt
- Download pre-trained models:
Currently available pre-trained models include:
- TRELLIS-image-large: Large image-to-3D model (1.2B parameters)
- TRELLIS-text-base: Base text-to-3D model (342M parameters)
- TRELLIS-text-large: Large text-to-3D model (1.1B parameters)
- TRELLIS-text-xlarge: Extra-large text-to-3D model (2.0B parameters)
These models are hosted on Hugging Face. Calling from_pretrained with a repository name downloads the weights automatically on first use.
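If you script against several of these models, a small lookup table keeps the choice explicit. This is an illustrative sketch only: `MODELS` and `pick_model` are hypothetical names, the parameter counts are copied from the list above, and the keys are model names rather than verified Hugging Face repository IDs.

```python
# Parameter counts are taken from the model list in this guide.
MODELS = {
    "TRELLIS-image-large": {"mode": "image", "params_millions": 1200},
    "TRELLIS-text-base": {"mode": "text", "params_millions": 342},
    "TRELLIS-text-large": {"mode": "text", "params_millions": 1100},
    "TRELLIS-text-xlarge": {"mode": "text", "params_millions": 2000},
}


def pick_model(mode, max_params_millions=None):
    """Return the largest model for `mode` within an optional parameter budget."""
    candidates = [
        (info["params_millions"], name)
        for name, info in MODELS.items()
        if info["mode"] == mode
        and (max_params_millions is None
             or info["params_millions"] <= max_params_millions)
    ]
    if not candidates:
        raise ValueError(f"no {mode!r} model fits the requested budget")
    return max(candidates)[1]


print(pick_model("text"))        # TRELLIS-text-xlarge
print(pick_model("text", 500))   # TRELLIS-text-base
```

Picking a smaller text model this way is one option when VRAM is tight; for image input, the list above offers only one size.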
Usage Tutorial
Image-to-3D Example
import os
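# Choose the spconv algorithm before importing trellis.
# 'native' skips the startup benchmark; 'auto' benchmarks first and can be faster over many runs.
os.environ['SPCONV_ALGO'] = 'native'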
import imageio
from PIL import Image
from trellis.pipelines import TrellisImageTo3DPipeline
from trellis.utils import render_utils, postprocessing_utils
# Load model
pipeline = TrellisImageTo3DPipeline.from_pretrained("JeffreyXiang/TRELLIS-image-large")
pipeline.cuda()
# Load input image
image = Image.open("input.png")
# Run generation
outputs = pipeline.run(
    image,
    seed=1,
    # Optional parameters
    # sparse_structure_sampler_params={
    #     "steps": 12,
    #     "cfg_strength": 7.5,
    # },
    # slat_sampler_params={
    #     "steps": 12,
    #     "cfg_strength": 3,
    # },
)
# Render preview video
video = render_utils.render_video(outputs['gaussian'][0])['color']
imageio.mimsave("preview_gs.mp4", video, fps=30)
# Export 3D file
glb = postprocessing_utils.to_glb(
    outputs['gaussian'][0],
    outputs['mesh'][0],
    simplify=0.95,      # Fraction of triangles to remove during simplification
    texture_size=1024,  # Side length of the baked texture, in pixels
)
glb.export("output.glb")
# Save the 3D Gaussians as a PLY file
outputs['gaussian'][0].save_ply("output.ply")
Web Demo Interface
TRELLIS provides a Gradio-based web demo interface. Run the following commands to start:
# Install additional dependencies
. ./setup.sh --demo
# Start service
python app.py
Once the server starts, open the local URL printed in the terminal to use the web interface in your browser.
Best Practices
- Input Image Recommendations:
  - Use clear images with moderate contrast
  - Ensure object contours are clearly visible
  - Avoid complex backgrounds and occlusions
- Generation Parameter Tuning:
  - Increase sampling steps for better quality, at the cost of longer generation time
  - Adjust the cfg_strength parameter to control how closely the output follows the input
  - Try different random seeds
- Performance Optimization:
  - Use the native spconv backend for one-off runs, since it skips the startup benchmark that auto performs
  - Reduce texture resolution to decrease VRAM usage
  - Adjust the mesh simplification ratio as needed during export
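The tuning advice above (more steps, different cfg_strength values, different seeds) amounts to a small parameter sweep. Here is a rough sketch of how such a sweep could be enumerated. `sampler_grid` is a hypothetical helper, the keyword names mirror the optional parameters shown in the image-to-3D example, and holding the second stage's cfg_strength at 3 is an arbitrary choice made for illustration.

```python
from itertools import product


def sampler_grid(seeds, steps_options, cfg_options):
    """Yield keyword arguments for one generation run per combination."""
    for seed, steps, cfg in product(seeds, steps_options, cfg_options):
        yield {
            "seed": seed,
            # Stage 1: sparse structure sampling.
            "sparse_structure_sampler_params": {
                "steps": steps,
                "cfg_strength": cfg,
            },
            # Stage 2: structured latent sampling; steps are shared with
            # stage 1 and cfg_strength is held at the example's value.
            "slat_sampler_params": {"steps": steps, "cfg_strength": 3},
        }


runs = list(sampler_grid(seeds=[1, 2], steps_options=[12, 25], cfg_options=[7.5]))
print(len(runs))  # 4 combinations: 2 seeds x 2 step counts x 1 cfg value
# With a loaded pipeline and image: outputs = pipeline.run(image, **runs[0])
```

Each run costs a full generation, so it is usually worth keeping the grid small and comparing the preview videos before widening it.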
Common Issues
- Insufficient VRAM:
  - Reduce batch size
  - Use smaller model versions
  - Decrease sampling steps
- Generation Quality Issues:
  - Check input image quality
  - Increase sampling steps
  - Adjust the cfg_strength parameter
- Large Export Files:
  - Increase the mesh simplification ratio
  - Reduce texture resolution
  - Choose the format that fits the use case (GLB for a textured mesh, PLY for the 3D Gaussians)
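To get a feel for how the two export knobs affect file size, a back-of-the-envelope estimate is often enough. The sketch below makes two assumptions: that `simplify` is the fraction of triangles removed, and that the texture is held as uncompressed 8-bit RGBA before it is encoded. Both helper functions are hypothetical, and the real GLB will typically be smaller because the embedded image is compressed.

```python
def remaining_faces(face_count, simplify):
    """Faces left if `simplify` is the fraction of triangles removed."""
    if not 0.0 <= simplify < 1.0:
        raise ValueError("simplify must be in [0, 1)")
    return round(face_count * (1.0 - simplify))


def texture_bytes(texture_size, channels=4, bytes_per_channel=1):
    """Uncompressed size of a square texture with 8-bit channels by default."""
    return texture_size * texture_size * channels * bytes_per_channel


print(remaining_faces(1_000_000, 0.95))   # 50000
print(texture_bytes(1024) / 2**20)        # 4.0 (MiB)
print(texture_bytes(512) / 2**20)         # 1.0 (MiB)
```

Halving the texture side length cuts the uncompressed texture to a quarter, which is why lowering texture_size is usually the quickest way to shrink an export.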