This project aims to build an AI pipeline that converts 2D images (with 32-bit color) into 3D models using vector-based geometry (curves) and neural implicit surfaces (signed distance fields, SDFs), and outputs high-fidelity, HDR-ready 3D assets in USD/glTF format.
- Input: High-res raster images (PNG/EXR, 32-bit color)
- Supports multi-image (multi-view) input for each 3D object
- Vectorize silhouettes to SVG curves
- Dual encoders: CNN for raster, MLP/Transformer for curves
- Neural SDF for geometry, color head for 32-bit RGBA
- Differentiable rendering (PyTorch3D/NVDiffRast/redner)
- Losses: photometric, silhouette IoU, eikonal, curve reconstruction, total variation
- Progressive training and validation
- Export watertight mesh with 32-bit textures (PNG/EXR)
- Package as USD/glTF
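The actual loss implementations will live in the model/ module, but two of the listed terms are simple enough to sketch in numpy. The function names below are illustrative, not the project's API:

```python
import numpy as np

def eikonal_loss(grads: np.ndarray) -> float:
    """Eikonal term: penalize SDF gradient norms that deviate from 1.

    grads: (N, 3) array of SDF gradients at sampled points.
    """
    norms = np.linalg.norm(grads, axis=-1)
    return float(np.mean((norms - 1.0) ** 2))

def silhouette_iou_loss(pred: np.ndarray, target: np.ndarray, eps: float = 1e-8) -> float:
    """1 - IoU between soft silhouette masks with values in [0, 1]."""
    inter = np.sum(pred * target)
    union = np.sum(pred) + np.sum(target) - inter
    return float(1.0 - inter / (union + eps))
```

In training these would be computed on autograd tensors (e.g. PyTorch) rather than numpy arrays, so gradients can flow back into the SDF network.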
```
vector32-3D/
│
├── data/           # Datasets: images, SVGs, poses
├── vectorization/  # SVG extraction and cleaning
├── model/          # Encoders, SDF, color head, losses
├── renderer/       # Differentiable rendering
├── training/       # Training and validation scripts
├── export/         # Mesh extraction, UV unwrapping, texture baking
├── packaging/      # Export to USD/glTF
├── research/       # References and papers
└── README.md
```
- Define 3D representation (curves + SDF + 32-bit RGBA)
- Prepare dataset: raster images, SVG curves, camera poses
- Ensure dataset supports multiple images per object (multi-view)
- Vectorization pipeline (SVG extraction/cleaning)
- Dual-encoder model (CNN + MLP/Transformer)
- Design model to accept and aggregate features from multiple images/views
- SDF decoder and color head
- Differentiable renderer integration
- Loss functions
- Training loop
- Implement batching and aggregation for multi-image input
- Validation
- Validate on held-out multi-view images
- Mesh extraction and UV unwrapping
- Texture baking and export
- Packaging (USD/glTF)
- Research tracking
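For the multi-image batching and aggregation items above, one common design is a permutation-invariant reduction over per-view encoder features, which lets the model accept a variable number of views per object. A minimal numpy sketch (the function name and signature are illustrative, not the project's API):

```python
import numpy as np

def aggregate_views(view_features, mode="mean"):
    """Order-invariant aggregation of per-view feature vectors.

    view_features: list of (feat_dim,) arrays, one per input view.
    mean/max reductions are permutation-invariant, so the result does
    not depend on the order in which views are supplied.
    """
    stacked = np.stack(view_features, axis=0)  # (n_views, feat_dim)
    if mode == "mean":
        return stacked.mean(axis=0)
    if mode == "max":
        return stacked.max(axis=0)
    raise ValueError(f"unknown aggregation mode: {mode}")
```

Attention-weighted pooling is a natural upgrade once a Transformer encoder is in place, but mean/max pooling is a reasonable baseline.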
- diffvg, PyTorch3D, NVDiffRast, redner, NeuS, SIREN, EG3D, Illustration2VecSDF, OpenEXR, xatlas, Blender API
- Clone the repo and install dependencies (see `requirements.txt`).
- Prepare your dataset in `data/`.
  - Organize images so each object has multiple views (multi-image support).
  - If you use Blender, you can run `blenderEXRdatasetscript.py` to help build your dataset. Edit the asset folder and output root in the script to point at the folders you are using, then run:

    ```
    "file path to your Blender.exe" --background --python "file path to blenderEXRdatasetscript.py"
    ```

- Follow the roadmap to implement each module.
MIT
This pipeline provides a robust, extensible framework for loading, validating, and processing multi-view 3D datasets with 32-bit RGBA images, camera metadata, and downstream integration for vectorization, neural SDF, and asset schema workflows.
Images/Metadata → [Image Loader] → [Camera Metadata Parser] → [View Alignment] → [Consistency Checker] → [Pipeline Integration] → [Vectorization/SDF/Asset Schema]
- Supports 32-bit RGBA PNG and EXR files
- Usage:

  ```python
  from data.image_loader import ImageLoader

  arr = ImageLoader.load_image('view_000.exr')
  ```
- Dedicated EXR loader/writer for 32-bit float RGBA
- Usage:

  ```python
  from data.exr_image import EXRImage

  arr = EXRImage.load('img.exr')
  EXRImage.save('out.exr', arr)
  ```
- Loads/validates camera intrinsics/extrinsics from JSON/CSV
- Usage:

  ```python
  from data.camera_metadata import CameraMetadataParser

  entries = CameraMetadataParser.load('cameras.json')
  ```
- Normalizes/aligns camera poses to canonical frame
- Usage:

  ```python
  from data.view_alignment import ViewAlignment

  aligned_Rs, aligned_ts = ViewAlignment.align_cameras(Rs, ts)
  ```
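The exact convention used by `ViewAlignment.align_cameras` is defined in `data/view_alignment.py`; as a sketch of one common canonicalization, assuming a world-to-camera convention `x_cam = R @ x_world + t`, all poses can be re-expressed relative to the first view (the helper name below is illustrative):

```python
import numpy as np

def align_to_first_camera(Rs, ts):
    """Re-express camera poses relative to the first view.

    Assumes x_cam = R @ x_world + t. Choosing camera 0's frame as the
    new world frame gives x_cam_i = (R_i R_0^T) x_cam0 + (t_i - R_i R_0^T t_0),
    so the first camera ends up with identity rotation and zero translation.
    """
    R0, t0 = Rs[0], ts[0]
    aligned_Rs, aligned_ts = [], []
    for R, t in zip(Rs, ts):
        R_rel = R @ R0.T
        aligned_Rs.append(R_rel)
        aligned_ts.append(t - R_rel @ t0)
    return aligned_Rs, aligned_ts
```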
- Validates image/metadata count, pose consistency, outliers
- Usage:

  ```python
  from data.consistency_checker import ConsistencyChecker

  ok = ConsistencyChecker.run_all_checks(images, metadata, 'cameras.json')
  ```
- Connects validated data to vectorization, SDF, asset schema
- Usage:

  ```python
  from data.pipeline_integration import PipelineIntegration

  integration = PipelineIntegration(asset)
  views = integration.get_all_views()
  ```
```
dataset_root/
  asset_001/
    images/
      view_000.exr
      view_001.exr
    cameras.json
    curves.json
    metadata.json
```
JSON format (`cameras.json`):

```json
[
  {
    "view_id": "0",
    "image_path": "images/view_000.exr",
    "intrinsics": [[1,0,0],[0,1,0],[0,0,1]],
    "extrinsics": {"R": [[1,0,0],[0,1,0],[0,0,1]], "t": [0,0,0]}
  }
]
```

CSV format:

```
view_id,image_path,K_00,K_01,K_02,K_10,K_11,K_12,K_20,K_21,K_22,R_00,R_01,R_02,R_10,R_11,R_12,R_20,R_21,R_22,t_0,t_1,t_2
0,images/view_000.exr,1,0,0,0,1,0,0,0,1,1,0,0,0,1,0,0,0,1,0,0,0
```
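`CameraMetadataParser` handles this parsing inside the pipeline; as a standalone sketch of how the JSON schema above maps to numpy arrays (the helper name is illustrative, not the project's API):

```python
import json
import numpy as np

def load_cameras(path):
    """Parse the cameras.json schema into per-view numpy matrices."""
    with open(path) as f:
        entries = json.load(f)
    cams = []
    for e in entries:
        cams.append({
            "view_id": e["view_id"],
            "image_path": e["image_path"],
            "K": np.asarray(e["intrinsics"], dtype=np.float32),       # 3x3 intrinsics
            "R": np.asarray(e["extrinsics"]["R"], dtype=np.float32),  # 3x3 rotation
            "t": np.asarray(e["extrinsics"]["t"], dtype=np.float32),  # 3-vector translation
        })
    return cams
```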
- Vectorization: Use `to_vectorization_format()` from `PipelineIntegration` for curve extraction.
- SDF Training: Use `to_sdf_training_format()` for neural SDF workflows.
- Asset Schema: All data is compatible with `MultiViewAsset` for export and further processing.
- Run all tests:

  ```
  python -m data.run_tests
  ```
- All modules will report PASS/FAIL. See logs for error details.
- Troubleshooting: Check for missing files, shape mismatches, or metadata errors in logs.
- Add new image formats: Extend `ImageLoader` and `EXRImage`.
- Add validation rules: Update `ConsistencyChecker`.
- Add downstream modules: Implement new adapters in `PipelineIntegration`.
- Update documentation and tests for all changes.
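One way to keep new-format support pluggable is an extension-to-loader registry. This is a hypothetical pattern, not `ImageLoader`'s actual structure; the `.npy` loader is added purely for illustration:

```python
import numpy as np

# Hypothetical registry -- the real ImageLoader may dispatch differently;
# adapt this sketch to its actual structure.
_LOADERS = {}

def register_loader(ext):
    """Register a loader function for a file extension (e.g. '.npy')."""
    def decorator(func):
        _LOADERS[ext.lower()] = func
        return func
    return decorator

@register_loader(".npy")
def load_npy(path):
    # Example format added only to demonstrate the extension point.
    return np.load(path)

def load_image(path):
    """Dispatch to the loader registered for the file's extension."""
    ext = path[path.rfind("."):].lower()
    if ext not in _LOADERS:
        raise ValueError(f"no loader registered for '{ext}'")
    return _LOADERS[ext](path)
```

New formats then only need a registered function; the call sites keep using `load_image` unchanged.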
- All modules are documented with docstrings and usage examples.
- See `docs/` for architecture diagrams and advanced usage.
- New contributors can get started in 30 minutes by following this README and running the test suite.
For more details, see code docstrings and the docs/ directory.
ColorField is a modular, numpy-based UV space color predictor for 32-bit RGBA and HDR (float32) color fields. It is designed for research extensibility, robust input validation, and high-fidelity 3D asset generation.
- `input_dim` (int): Input dimension (default: 2 for UV).
- `output_dim` (int): Output dimension (default: 4 for RGBA).
- `hidden_layers` (List[int]): Hidden layer sizes (default: [64, 64]).
- `activation` (Callable): Activation function (default: `np.tanh`).
- `use_hdr` (bool): If True, output is float32 (HDR); else, output is clamped to [0, 1] (LDR).
- `logger` (logging.Logger): Optional logger for debug/info/warning messages.
- `seed` (int): Optional random seed for reproducibility.
- `forward(uv: np.ndarray) -> np.ndarray`: Predict RGBA color for given UV coordinates.
- `predict_multi_view(uv_list: List[np.ndarray]) -> np.ndarray`: Aggregate predictions from multiple UV arrays for multi-view consistency.
- `scale_texture(colors: np.ndarray, target_shape: tuple) -> np.ndarray`: Scale predicted color texture to a new resolution (nearest neighbor).
- `add_custom_head(func: Callable)`: Add a custom color head for research (e.g., spectral mapping).
```python
from model.colorfield import ColorField
import numpy as np

cf = ColorField(input_dim=2, output_dim=4, use_hdr=False, seed=42)
uv = np.random.rand(10, 2).astype(np.float32)
colors = cf.forward(uv)

# Multi-view consistency
uv1 = np.random.rand(10, 2).astype(np.float32)
uv2 = np.random.rand(10, 2).astype(np.float32)
agg_colors = cf.predict_multi_view([uv1, uv2])

# Texture scaling
tex = np.random.rand(32, 32, 4).astype(np.float32)
scaled = cf.scale_texture(tex, (64, 64))

# Extensibility: custom color head
cf.add_custom_head(lambda x: np.ones_like(x) * 0.5)
```

- LDR vs HDR: Set `use_hdr=True` for float32 output (HDR, unclamped), or `False` for [0, 1] clamped output (LDR).
- Extensibility: Use `add_custom_head` to experiment with new color field heads (e.g., neural texture compression, spectral color fields).
- Logging: Pass a custom logger or use the default for debug/info/warning output.
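The LDR/HDR switch comes down to whether the raw head output is clamped. The effect can be illustrated without `ColorField` itself (the raw values below are made up):

```python
import numpy as np

raw = np.array([-0.2, 0.5, 1.7], dtype=np.float32)  # made-up raw head output

ldr = np.clip(raw, 0.0, 1.0)  # use_hdr=False: values clamped to [0, 1]
hdr = raw                     # use_hdr=True: unclamped float32 passthrough
```

Keeping values unclamped preserves highlights above 1.0, which matters when baking EXR textures; clamping is appropriate for 8-bit-style LDR targets.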
Unit and integration tests for ColorField are in `model/test_model_colorfield.py`. Run with:

```
python -m unittest model/test_model_colorfield.py
```

Tests cover:
- Color prediction (LDR/HDR)
- Multi-view consistency
- Input validation (shapes/types/ranges)
- Extensibility (custom head)
- Texture scaling
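New tests can follow the same `unittest` shape. The sketch below uses a trivial stand-in predictor instead of importing `ColorField`, so the pattern runs anywhere; the stub is purely illustrative:

```python
import unittest
import numpy as np

def predict_stub(uv):
    """Stand-in for a forward() call: maps (N, 2) UVs to (N, 4) LDR RGBA."""
    colors = np.tile(uv.mean(axis=1, keepdims=True), (1, 4))
    return np.clip(colors, 0.0, 1.0).astype(np.float32)

class TestPredictStub(unittest.TestCase):
    def test_output_shape(self):
        uv = np.random.rand(10, 2).astype(np.float32)
        self.assertEqual(predict_stub(uv).shape, (10, 4))

    def test_ldr_range(self):
        uv = np.random.rand(10, 2).astype(np.float32)
        out = predict_stub(uv)
        self.assertTrue(np.all(out >= 0.0) and np.all(out <= 1.0))

# run with: python -m unittest <this_file>
```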
The training progress tracking system now supports logging research paper citations and metadata via the VisualizationManager:
- `VisualizationManager.log_research_paper(paper: Dict[str, Any])`
- `paper`: Dictionary with fields such as `title`, `authors`, `doi`, `bibtex`, `pdf_url`, `notes`, etc.
- Each backend (JSON, CSV, REST, etc.) will store or transmit the citation appropriately.
```python
paper = {
    'title': 'A Great Paper',
    'authors': 'Doe, J.; Smith, A.',
    'doi': '10.1234/example.doi',
    'bibtex': '@article{doe2024, ...}',
    'pdf_url': 'http://example.com/paper.pdf',
    'notes': 'Key reference for method.'
}
manager.log_research_paper(paper)
```
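The per-backend storage step is defined by the training module's backends; as an illustrative sketch of what a JSON file backend might do (the class name and behavior are assumptions, not the project's API):

```python
import json

class JSONPaperBackend:
    """Hypothetical backend: append paper citations to a JSON list on disk."""

    def __init__(self, path):
        self.path = path

    def log_research_paper(self, paper):
        # Load the existing citation list, or start fresh if none exists yet.
        try:
            with open(self.path) as f:
                papers = json.load(f)
        except FileNotFoundError:
            papers = []
        papers.append(paper)
        with open(self.path, "w") as f:
            json.dump(papers, f, indent=2)
```

A CSV or REST backend would implement the same `log_research_paper` method, so the manager can fan out one citation to all configured backends.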