As data scientists, we are constantly seeking tools and frameworks that enable us to efficiently process and analyze data. In this blog post, we will explore OpenUSD, a powerful framework that goes beyond its traditional use in computer graphics and offers exciting possibilities for data science pipelines.
OpenUSD, or Universal Scene Description, provides a versatile and extensible platform for managing and processing complex data models. It can represent a wide range of data types and enhance datasets in various domains.
Let's dive into what data scientists should know about OpenUSD and how it can enhance their workflows.
Common Data Modeling
OpenUSD introduces a unified data model that allows data scientists to represent and manipulate complex 3D data structures efficiently. With USD, object data can be organized into hierarchical scene graphs. This hierarchical structure is particularly useful when dealing with large-scale datasets or complex data dependencies.
Adopting OpenUSD also makes data easier to share and reuse. Data sources in OpenUSD can be integrated into an aggregate view that also encompasses content from other file formats.
File Format Plugins
USD file format plugins provide a way to leverage the power of OpenUSD while keeping your existing datasets in their current formats. A file format plugin reads a file in another format and translates it into OpenUSD data on the fly.
For example, in 3D data science, Wavefront OBJ files are popular for 3D mesh data, and there are large datasets that use this format. With an OBJ file format plugin like the plugin recently open-sourced by Adobe, you can reference existing OBJ data and compose it in OpenUSD to add or override attributes or use it for scene assembly. The following kitchen.usd shows an example of assembling a kitchen scene using OBJ models for a teapot and a table. The teapot’s transform is overridden to rotate it and move it above the table.
kitchen.usd
#usda 1.0
(
    defaultPrim = "World"
    metersPerUnit = 1.0
    upAxis = "Z"
)

def Xform "World"
{
    def "teapot" (prepend references = @./utah_teapot.obj@)
    {
        float3 xformOp:rotateXYZ = (0, 0, 45.0)
        float3 xformOp:scale = (1, 1, 1)
        double3 xformOp:translate = (0, 0, 1.0)
        uniform token[] xformOpOrder = ["xformOp:translate", "xformOp:rotateXYZ", "xformOp:scale"]
    }
    def "table" (prepend references = @./table.obj@) { }
}
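When a dataset contains many OBJ files, an assembly layer like kitchen.usd can be generated rather than written by hand. The following is a minimal sketch that uses only the Python standard library to emit usda text. The `Placement` class and the `build_assembly_layer` function are hypothetical names introduced for illustration, and the output only resolves in OpenUSD if an OBJ file format plugin is installed.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    """One referenced asset and its transform in the assembled scene."""
    name: str                              # prim name under /World
    asset_path: str                        # path to the referenced file, e.g. an OBJ
    translate: tuple = (0.0, 0.0, 0.0)
    rotate_xyz: tuple = (0.0, 0.0, 0.0)


def _vec(values):
    """Format a 3-tuple as a usda tuple literal such as (0, 0, 1)."""
    return "(" + ", ".join(f"{v:g}" for v in values) + ")"


def build_assembly_layer(placements):
    """Return usda text that references each asset under an Xform named World."""
    lines = [
        "#usda 1.0",
        "(",
        '    defaultPrim = "World"',
        "    metersPerUnit = 1.0",
        '    upAxis = "Z"',
        ")",
        "",
        'def Xform "World"',
        "{",
    ]
    for p in placements:
        lines += [
            f'    def "{p.name}" (',
            f"        prepend references = @{p.asset_path}@",
            "    )",
            "    {",
            f"        float3 xformOp:rotateXYZ = {_vec(p.rotate_xyz)}",
            f"        double3 xformOp:translate = {_vec(p.translate)}",
            '        uniform token[] xformOpOrder = ["xformOp:translate", "xformOp:rotateXYZ"]',
            "    }",
        ]
    lines.append("}")
    return "\n".join(lines) + "\n"


# Rebuild the kitchen scene: the teapot is rotated and lifted above the table.
layer_text = build_assembly_layer([
    Placement("teapot", "./utah_teapot.obj", translate=(0, 0, 1.0), rotate_xyz=(0, 0, 45.0)),
    Placement("table", "./table.obj"),
])
print(layer_text)
```

Writing `layer_text` to a `.usda` file gives a layer equivalent in spirit to the kitchen example. The OBJ files themselves are never modified, because the transforms live only in the assembly layer.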
Composability
OpenUSD excels as a composable scene description. This takes form in two main ways: scene aggregation and progressive refinement or enhancement. Scene aggregation involves referencing many 3D assets from different sources and non-destructively assembling them to form a larger scene. You can make changes to the referenced 3D assets and the assemblies will also pick up the change. Progressive refinement allows you to start with a coarse, low-information asset before progressively and non-destructively layering in details or changing parameters to further refine it from coarse to fine.
Looking again at the example of the OBJ mesh from earlier, you can start with just the mesh data from the OBJ and use OpenUSD to add physical material properties, semantic labels, and other ancillary aspects such as geospatial attribution. In this example, the refinement is composed using sublayers for the different types of details I want to add to my asset.
teapot.usd
#usda 1.0
(
    defaultPrim = "World"
    metersPerUnit = 1.0
    upAxis = "Z"
    subLayers = [
        @./semantic_labels.usd@,
        @./materials.usd@,
        @./utah_teapot.obj@
    ]
)

def Xform "World"
{
}
Building your datasets like this makes them extremely portable and modular. It also allows you to improve the fidelity and quality of data sources.
I can share the mesh with all of the attributes or I can mute or remove the layers that are not relevant for different pipelines. The SimReady specification is an example of these principles in practice today.
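To build intuition for how sublayers combine, here is a toy model in plain Python. It is not the OpenUSD API and it ignores most of composition. It only mimics two behaviors: sublayers are listed strongest first, so an earlier layer's opinion on an attribute wins, and a muted layer contributes nothing. The `compose` function, the `/teapot` prim path, and all attribute names and values are made up for illustration.

```python
def compose(layers, muted=()):
    """Resolve attribute values across layers listed strongest first.

    Each layer is a (name, opinions) pair, where opinions maps a
    prim path to a dict of {attribute name: value}.
    """
    resolved = {}
    # Walk from weakest to strongest so stronger opinions overwrite weaker ones.
    for name, opinions in reversed(layers):
        if name in muted:
            continue  # a muted layer contributes nothing to the result
        for prim_path, attrs in opinions.items():
            resolved.setdefault(prim_path, {}).update(attrs)
    return resolved


# Same order as the subLayers list in teapot.usd: strongest first.
layers = [
    ("semantic_labels.usd", {"/teapot": {"label": "teapot"}}),
    ("materials.usd", {"/teapot": {"material": "porcelain"}}),
    ("utah_teapot.obj", {"/teapot": {"mesh": "<points and faces>", "material": "default"}}),
]

# Every layer active: the material opinion from materials.usd wins over the OBJ.
full = compose(layers)

# A geometry-only pipeline mutes the refinement layers and sees just the mesh.
geometry_only = compose(layers, muted={"semantic_labels.usd", "materials.usd"})

print(full["/teapot"])
print(geometry_only["/teapot"])
```

The same source data serves both pipelines. Only the set of active layers differs, and nothing in the OBJ layer is edited.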
Custom Pipelining
OpenUSD's Hydra framework offers data scientists the ability to create custom pipelines for processing and analyzing data. Hydra allows for the implementation of business logic as a customizable chain of runtime scene indexes. This decoupling of data processing from specific runtime environments enables data scientists to leverage the power of USD in their own data science workflows.
Extensibility
One of the key strengths of OpenUSD is its extensibility. Data scientists can extend OpenUSD's capabilities by creating their own scene delegates and render delegates. This means that any scene graph capable of answering queries served by scene delegates can be used, providing flexibility in integrating diverse data sources and formats.
OpenUSD is also extensible through custom schemas. As data scientists begin to map concepts from their data models to OpenUSD, they may find that not every concept maps directly and a translation to an existing concept in OpenUSD may not be suitable. When data scientists identify a conceptual data mapping gap, they can formalize the novel concept into a new schema that can be leveraged immediately.
As the schema matures, data scientists are encouraged to share their schemas with other organizations and institutions and to take the schema through the full schema journey so that it can be reviewed, published and standardized. A good example of this is the semantic schema proposal from NVIDIA to standardize semantic labeling of 3D assets for synthetic data generation.
Procedural Processing with Hydra 2.0
Hydra 2.0 takes OpenUSD's capabilities to the next level by introducing procedural processing of scene indexes. This allows data scientists to process chains of scene indexes through multiple pipeline steps, enabling more complex and customizable workflows. With Hydra 2.0, data scientists can iterate and optimize their pipelines, making it easier to experiment with different data processing techniques. Scene index plugins are also portable so that you can share their modular business logic between OpenUSD applications.
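The chaining idea can be sketched in a few lines of Python. Hydra's scene index API is C++, where a scene index answers queries such as `GetPrim` and `GetChildPrimPaths`. The classes below are a conceptual model only, not that API: every class name, field name, and value is hypothetical. Each filter wraps one input and adjusts what a query returns, so the work happens when a prim is requested rather than up front.

```python
class DatasetSceneIndex:
    """Source of the chain: answers prim queries from an in-memory dict."""

    def __init__(self, prims):
        self._prims = prims

    def get_prim(self, path):
        # Return a copy so downstream filters never mutate the source data.
        return dict(self._prims.get(path, {}))

    def get_child_prim_paths(self, path):
        prefix = path.rstrip("/") + "/"
        return sorted(p for p in self._prims
                      if p.startswith(prefix) and "/" not in p[len(prefix):])


class FilteringSceneIndex:
    """Base filter: forwards every query to its single input unchanged."""

    def __init__(self, input_index):
        self._input = input_index

    def get_prim(self, path):
        return self._input.get_prim(path)

    def get_child_prim_paths(self, path):
        return self._input.get_child_prim_paths(path)


class UnitScaleSceneIndex(FilteringSceneIndex):
    """Converts a 'height_cm' field to 'height_m' when a prim is queried."""

    def get_prim(self, path):
        prim = super().get_prim(path)
        if "height_cm" in prim:
            prim["height_m"] = prim.pop("height_cm") / 100.0
        return prim


class LabelSceneIndex(FilteringSceneIndex):
    """Adds a 'label' field looked up by prim path."""

    def __init__(self, input_index, labels):
        super().__init__(input_index)
        self._labels = labels

    def get_prim(self, path):
        prim = super().get_prim(path)
        if path in self._labels:
            prim["label"] = self._labels[path]
        return prim


prims = {
    "/World": {},
    "/World/teapot": {"height_cm": 25.0},
    "/World/table": {"height_cm": 75.0},
}

# Chain: dataset -> unit conversion -> labeling. Steps can be reordered or swapped.
chain = LabelSceneIndex(UnitScaleSceneIndex(DatasetSceneIndex(prims)),
                        labels={"/World/teapot": "teapot"})

teapot = chain.get_prim("/World/teapot")
children = chain.get_child_prim_paths("/World")
print(teapot, children)
```

Because each step only depends on the query interface of its input, a step can be inserted, removed, or reused in another chain without touching the others.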
OpenUSD offers data scientists a powerful and versatile framework for managing and processing complex data models. Its unified data model, extensibility, and generality make it an invaluable framework for data science workflows and pipelines. With extensibility in both common data modeling via schema plugins, and runtime kernels in Hydra 2.0, OpenUSD empowers data scientists to efficiently process and analyze large-scale datasets, enabling faster and more scalable computations. As data scientists, it is essential to explore and leverage tools like OpenUSD to unlock the full potential of our data-driven endeavors.
A growing number of tools and applications already support OpenUSD import and export. Developers can learn how to add OpenUSD support to their applications in NVIDIA’s OpenUSD Documentation, which includes first steps, guided learning, and technical references to get started.
To access more resources and get started with OpenUSD, visit NVIDIA’s Universal Scene Description page. Get started with NVIDIA Omniverse by downloading the standard license for free.
The Alliance for OpenUSD (AOUSD) is an open, non-profit organization dedicated to promoting the interoperability of 3D content through OpenUSD. Learn more and become a member today.