Microsoft TRELLIS Tutorial: A Guide to 3D Generation from Images and Text

Introduction to TRELLIS

TRELLIS is a large-scale 3D asset generation model open-sourced by Microsoft. It generates high-quality 3D content from text or image prompts. Assets are encoded in a structured 3D latent representation, which lets the model scale well and decode the same latent into several output formats.

Key Features:

  • Supports both image-to-3D and text-to-3D generation
  • Uses a structured 3D latent space for higher generation quality
  • Produces multiple 3D representations (3D Gaussians, radiance fields, meshes)
  • Open source and easy to deploy
  • Exports to standard 3D file formats such as GLB and PLY

Online Demo

A hosted demo of TRELLIS is available on Hugging Face Spaces.

Quick Start

System Requirements

  • CUDA-compatible NVIDIA GPU (RTX 30/40 series recommended)
  • CUDA Toolkit 11.8 or 12.2
  • Python 3.8+
  • conda package manager
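
Before installing, it can help to confirm that the prerequisites above are actually present. The following is a minimal sketch, not part of TRELLIS: the function name `check_environment` is hypothetical, and it only checks that the tools are on the PATH, not that their versions match.

```python
import shutil
import sys


def check_environment(min_python=(3, 8)):
    """Return a dict of basic prerequisite checks (True means the check passed)."""
    report = {
        # Python 3.8+ is listed in the system requirements
        "python": sys.version_info >= min_python,
        # conda is used to create the environment
        "conda": shutil.which("conda") is not None,
        # nvidia-smi ships with the NVIDIA driver
        "nvidia_smi": shutil.which("nvidia-smi") is not None,
        # nvcc ships with the CUDA Toolkit
        "nvcc": shutil.which("nvcc") is not None,
    }
    try:
        import torch  # optional: only available once dependencies are installed
        report["torch_cuda"] = bool(torch.cuda.is_available())
    except Exception:
        report["torch_cuda"] = None  # PyTorch missing or failed to load
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {ok}")
```

A `None` for `torch_cuda` simply means PyTorch is not installed yet, which is expected before running the setup script.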

Installation Steps

  1. Clone the repository:

     # --recurse-submodules also fetches any submodules the repository declares
     git clone --recurse-submodules https://github.com/microsoft/TRELLIS.git
     cd TRELLIS

  2. Create and activate the conda environment:

     # Using CUDA 11.8: create a new environment and install all dependencies
     . ./setup.sh --new-env --basic --xformers --flash-attn --diffoctreerast --spconv --mipgaussian --kaolin --nvdiffrast

     # For CUDA 12.2: create the environment manually, then install dependencies
     conda create -n trellis python=3.10
     conda activate trellis
     pip install -r requirements.txt

  3. Download pre-trained models:

Currently available pre-trained models include:

  • TRELLIS-image-large: Large image-to-3D model (1.2B parameters)
  • TRELLIS-text-base: Base text-to-3D model (342M parameters)
  • TRELLIS-text-large: Large text-to-3D model (1.1B parameters)
  • TRELLIS-text-xlarge: Extra-large text-to-3D model (2.0B parameters)

You can download these models from Hugging Face.
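
As a rough illustration, the model list above can be kept as a small catalog and fetched with `huggingface_hub.snapshot_download`. This is a sketch under assumptions: the `JeffreyXiang` namespace is taken from the image-to-3D example in this guide and is only assumed to apply to the text models, and the helper names are hypothetical. Since `from_pretrained` in that example is given a Hub repo ID directly, a manual download may not be needed at all.

```python
# Parameter counts (in millions) are copied from the model list above.
MODELS = {
    "TRELLIS-image-large": {"input": "image", "params_m": 1200},
    "TRELLIS-text-base": {"input": "text", "params_m": 342},
    "TRELLIS-text-large": {"input": "text", "params_m": 1100},
    "TRELLIS-text-xlarge": {"input": "text", "params_m": 2000},
}


def repo_id(model_name, namespace="JeffreyXiang"):
    """Build a Hub repo ID. ASSUMPTION: all models live under one namespace."""
    if model_name not in MODELS:
        raise KeyError(f"unknown model: {model_name}")
    return f"{namespace}/{model_name}"


def smallest_first(input_kind):
    """List model names for 'image' or 'text' input, smallest model first."""
    names = [n for n, m in MODELS.items() if m["input"] == input_kind]
    return sorted(names, key=lambda n: MODELS[n]["params_m"])


def download(model_name, local_dir=None):
    """Download a model snapshot; requires the huggingface_hub package."""
    from huggingface_hub import snapshot_download  # imported lazily
    return snapshot_download(repo_id=repo_id(model_name), local_dir=local_dir)
```

Listing models smallest first makes it easy to start with the base text model when VRAM is limited and move up only if quality is insufficient.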

Usage Tutorial

Image-to-3D Example

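import os
# Select the spconv algorithm before importing trellis.
# 'native' skips start-up benchmarking; 'auto' benchmarks first and may run faster afterwards.
os.environ['SPCONV_ALGO'] = 'native'  # Options: 'native' or 'auto'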

import imageio
from PIL import Image
from trellis.pipelines import TrellisImageTo3DPipeline
from trellis.utils import render_utils, postprocessing_utils

# Load model
pipeline = TrellisImageTo3DPipeline.from_pretrained("JeffreyXiang/TRELLIS-image-large")
pipeline.cuda()

# Load input image
image = Image.open("input.png")

# Run generation
outputs = pipeline.run(
    image,
    seed=1,
    # Optional parameters
    # sparse_structure_sampler_params={
    #     "steps": 12,
    #     "cfg_strength": 7.5,
    # },
    # slat_sampler_params={
    #     "steps": 12,
    #     "cfg_strength": 3,
    # },
)

# Render preview video
video = render_utils.render_video(outputs['gaussian'][0])['color']
imageio.mimsave("preview_gs.mp4", video, fps=30)

# Export 3D file
glb = postprocessing_utils.to_glb(
    outputs['gaussian'][0],
    outputs['mesh'][0],
    simplify=0.95,          # Fraction of triangles to remove during simplification
    texture_size=1024,      # Texture resolution in pixels
)
glb.export("output.glb")

# Save the 3D Gaussians as a PLY file
outputs['gaussian'][0].save_ply("output.ply")
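
Because results vary with the random seed, it is often useful to generate several candidates from the same image. The helper below is a sketch rather than part of the TRELLIS API: `iter_seeds` is a hypothetical name, and it relies only on the `pipeline.run(image, seed=...)` call shown above. It yields one result at a time so that only a single set of outputs needs to be held in memory.

```python
def iter_seeds(pipeline, image, seeds, **run_kwargs):
    """Yield (seed, outputs) pairs, running the pipeline once per seed.

    `pipeline` is any object exposing run(image, seed=..., **kwargs),
    such as the TrellisImageTo3DPipeline loaded in the example above.
    """
    for seed in seeds:
        yield seed, pipeline.run(image, seed=seed, **run_kwargs)


# Usage with the pipeline and image from the example above:
# for seed, outputs in iter_seeds(pipeline, image, seeds=[1, 2, 3]):
#     outputs['gaussian'][0].save_ply(f"output_seed{seed}.ply")
```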

Web Demo Interface

TRELLIS provides a Gradio-based web demo interface. Run the following commands to start:

# Install additional dependencies
. ./setup.sh --demo

# Start service
python app.py

Once the server has started, open the local URL that Gradio prints in the terminal to use the web interface in your browser.

Best Practices

  1. Input Image Recommendations:
     • Use clear images with moderate contrast
     • Ensure object contours are clearly visible
     • Avoid complex backgrounds and occlusions
  2. Generation Parameter Tuning:
     • Increase sampling steps for better quality, at the cost of longer generation time
     • Adjust cfg_strength to control how closely the result follows the input
     • Try different random seeds
  3. Performance Optimization:
     • Set SPCONV_ALGO to 'native' for one-off runs; 'auto' benchmarks at start-up, which slows the first run but can be faster afterwards
     • Reduce texture resolution to decrease VRAM usage
     • Adjust the mesh simplification ratio as needed during export
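
The parameter tuning advice above can be sketched as a small helper that builds the optional arguments for `pipeline.run`. The function name `build_run_kwargs` is hypothetical; the parameter names and default values (12 steps, cfg_strength 7.5 and 3) are the ones shown commented out in the image-to-3D example, and it is an assumption that both sampling stages should share the same step count.

```python
def build_run_kwargs(steps=12, structure_cfg=7.5, slat_cfg=3.0):
    """Build sampler settings for both generation stages of pipeline.run."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return {
        # Stage 1: sparse structure sampling
        "sparse_structure_sampler_params": {
            "steps": steps,
            "cfg_strength": structure_cfg,
        },
        # Stage 2: structured latent (SLAT) sampling
        "slat_sampler_params": {
            "steps": steps,
            "cfg_strength": slat_cfg,
        },
    }


# Usage with the pipeline and image from the image-to-3D example:
# outputs = pipeline.run(image, seed=1, **build_run_kwargs(steps=20))
```

Changing one setting at a time while keeping the seed fixed makes it easier to see what each parameter contributes.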

Common Issues

  1. Insufficient VRAM:
     • Reduce batch size
     • Use smaller model versions
     • Decrease sampling steps
  2. Generation Quality Issues:
     • Check input image quality
     • Increase sampling steps
     • Adjust the cfg_strength parameter
  3. Large Export Files:
     • Increase the mesh simplification ratio
     • Reduce texture resolution
     • Choose a format suited to the use case (for example, GLB for textured meshes, PLY for Gaussians)
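
For oversized exports, one possible approach is to step through progressively more aggressive settings until the file is small enough. The sketch below is illustrative only: `export_settings` and its step sizes are hypothetical choices, and it assumes, as in the export example, that `simplify` is the fraction of triangles removed and `texture_size` is the texture resolution.

```python
def export_settings(start_simplify=0.95, start_texture=1024,
                    min_texture=256, max_simplify=0.99):
    """Yield to_glb keyword settings, from highest quality to smallest file.

    Texture resolution is halved first; once it reaches min_texture,
    the simplification ratio is raised in steps of 0.02.
    """
    simplify, texture = start_simplify, start_texture
    while True:
        yield {"simplify": simplify, "texture_size": texture}
        if texture > min_texture:
            texture //= 2
        elif simplify < max_simplify:
            simplify = min(max_simplify, round(simplify + 0.02, 2))
        else:
            return


# Usage with the outputs from the image-to-3D example (size limit is arbitrary):
# import os
# for settings in export_settings():
#     glb = postprocessing_utils.to_glb(
#         outputs['gaussian'][0], outputs['mesh'][0], **settings)
#     glb.export("output.glb")
#     if os.path.getsize("output.glb") <= 10 * 1024 * 1024:
#         break
```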
