The Recommended Starter Dataset for Digital Twin Success

Written by Maisa Monteiro Da Cunha | November 21, 2025
GUIDE NO. 01 – DATA ACTIVATION SERIES

 

This is a practical introduction to activating value from key datasets and features in your digital twin.

Laying the data foundation for a seamless onboarding journey

Building a digital twin is a transformative step in digitising operations, enhancing asset information connectivity and management, and unlocking actionable insights across the lifecycle of industrial systems. However, the success of the digital twin initiative is fundamentally rooted in the quality and completeness of the data it is built upon. This document defines the Recommended Starter Dataset required to support the initial build of a digital twin. By “Recommended Starter Dataset” we mean the core set of data needed to create a digital twin that is functional, accessible to users, and capable of delivering immediate value, while acknowledging that further data enrichment can follow.

Your digital twin starts here

This guide aims to:

  • Clarify what constitutes a Recommended Starter Dataset across different domains (e.g., engineering, operations, maintenance).
  • Align stakeholders around clear expectations and deliverables.
  • Accelerate the initial deployment while laying a scalable foundation for future growth.

The goal is to help project teams and data owners focus their efforts efficiently, delivering value early and often in their digital twin journey.

Our role in Aize

At Aize, we believe that the success of any digital twin initiative hinges not just on technology, but on the quality of data, collaboration, planning, and support that underpins it. Our Delivery team will ensure that our customers are set up for success from day one. That starts with establishing a realistic and practical baseline for data readiness, striking a careful balance between ambition and feasibility. We work closely with stakeholders across disciplines to identify the Recommended Starter Dataset required, align expectations, and ensure that the right people and processes are in place to support delivery.

From Aize, we bring:

  • A proven platform for building and scaling digital twins, capable of ingesting and contextualising engineering, operational, and maintenance data.
  • Visual tools and spatial interfaces that allow users to interact meaningfully with their facility data from the outset.
  • Expertise in data contextualisation, quality checks, and digital model alignment to ensure the data is not just available, but usable.
  • Close collaboration with customer teams, including IT, engineering, and operations, to build shared understanding and momentum.

Together, Aize and the customer co-create a deployment path that delivers early value while laying the groundwork for future growth and capability.

What data makes a digital twin work: the Recommended Starter Dataset

Below is a breakdown of what constitutes the Recommended Starter Dataset across the key domains. These inputs ensure that the digital twin is usable, navigable, and foundational for growth.

Core system interfaces

  • Document Management System: must be accessible so that documents can be linked or retrieved.
  • Tag Database: a tag export is recommended.

Additional interfaces

  • CMMS (Computerized Maintenance Management System): recommended source for pulling key maintenance records. Enables visibility into work orders, asset history, and maintenance planning.
  • Scan Hosting Platform: hosts laser scans of the facility. Provides a critical visual anchor for spatial orientation, clash detection, and integration with engineering models or digital twins.
  • IDMS (Integrity Data Management System): supports the monitoring and management of asset integrity, including corrosion monitoring, risk assessments, inspection planning, and anomaly tracking.
  • Commissioning System: tracks pre-startup checks, system readiness, and handover processes. Supports progress monitoring and system validation prior to operations.
  • Construction Management System: captures execution plans, work packages, and as-built records. Integrates with engineering data to align actual vs. planned construction scope.
  • Procurement System: manages vendor data, material tracking, and delivery milestones. Ensures traceability of equipment and bulk materials from sourcing to site.

3D Model requirements

  • Geometry: accurate, representative as-built 3D model, essential for spatial orientation.
  • Naming Convention: tags should align with the customer's engineering naming convention.
  • Format: Aize-compatible format, preferably RVM/ATT or NWD.

Engineering documents

  • P&IDs (native format: .dwg + .pdf): core asset navigation and tag-based visual linking. Machine-readable PDFs enable the generation of smart drawings.
  • Isometrics (native format: .dwg + .pdf): detailed pipe segment drawings used for fabrication and construction. They include welds, supports, and dimensions, which are essential for spooling, material take-off, and field installation.
  • Plot Plans: provide spatial layout and navigational context for assets and systems within the facility.

Data reports/exports

  • Tag Register with Parent Relationships: captures the full hierarchy of tags with associated functional locations, supporting structured navigation and traceability.
  • System Register: logical groupings of equipment and pipelines, offering a system-level view of operational and engineering contexts.
  • Document-to-Tag References: enables retrieval of related documents based on asset/tag queries, supporting fast access during maintenance, operations, and engineering reviews.

Note: Missing P&IDs, tag registers, or system registers could limit core functionality such as navigation, smart linking, and document associations. These elements should be prioritized, but we can work with you to improve this data.
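As a rough illustration of why these exports matter, the sketch below checks a tag register for missing parents and builds a document lookup per tag. It is a minimal example under stated assumptions: the tag numbers, document IDs, row layouts, and the function names find_orphans and index_documents are all hypothetical and do not describe the Aize platform or any particular export format.

```python
from collections import defaultdict

# Hypothetical tag register rows: (tag, parent_tag); None marks a top-level tag.
tag_register = [
    ("20-PA-001", None),
    ("20-PA-001-M", "20-PA-001"),
    ("20-PT-1001", "20-PA-001"),
    ("20-XV-2002", "20-VA-999"),  # parent is missing from the register
]

# Hypothetical document-to-tag references: (document_id, tag).
doc_refs = [
    ("PID-20-001", "20-PA-001"),
    ("PID-20-001", "20-PT-1001"),
    ("DS-20-PA-001", "20-PA-001"),
    ("PID-20-002", "20-ZZ-0000"),  # tag is missing from the register
]


def find_orphans(register):
    """Return tags whose parent is named but not present in the register."""
    known = {tag for tag, _ in register}
    return sorted(
        tag for tag, parent in register
        if parent is not None and parent not in known
    )


def index_documents(refs, register):
    """Map each known tag to its documents; collect references to unknown tags."""
    known = {tag for tag, _ in register}
    by_tag = defaultdict(list)
    unknown = []
    for doc_id, tag in refs:
        if tag in known:
            by_tag[tag].append(doc_id)
        else:
            unknown.append((doc_id, tag))
    return dict(by_tag), unknown


orphans = find_orphans(tag_register)
docs_by_tag, unknown_refs = index_documents(doc_refs, tag_register)
print("Tags with missing parents:", orphans)
print("Documents for 20-PA-001:", docs_by_tag.get("20-PA-001", []))
print("References to unknown tags:", unknown_refs)
```

Checks of this kind can be run on an early export to show where navigation or document linking would break, before any data is loaded.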

Basic maintenance & inspection linkage

  • Work Orders & Functional Locations: enables navigation to historical tasks and maintenance status.
  • Tag-to-Work Order Link: traceability for asset records (can be built incrementally).
  • Inspection Results: structured and digitised (e.g., wall thickness).
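One way to picture the tag-to-work order link is as a join through functional locations. The sketch below assumes that each tag carries a functional location and that each work order in the CMMS export references one; that data shape, the sample identifiers, and the function name link_work_orders are illustrative assumptions rather than a description of any specific CMMS.

```python
from collections import defaultdict

# Hypothetical extract from the tag register: tag -> functional location.
tag_to_floc = {
    "20-PA-001": "FL-20-PUMP-A",
    "20-PT-1001": "FL-20-PUMP-A",
    "30-VA-010": "FL-30-SEP-1",
}

# Hypothetical CMMS export: (work_order_id, functional_location, status).
work_orders = [
    ("WO-1001", "FL-20-PUMP-A", "closed"),
    ("WO-1002", "FL-30-SEP-1", "open"),
    ("WO-1003", "FL-99-UNKNOWN", "open"),  # no tag shares this location yet
]


def link_work_orders(tag_to_floc, work_orders):
    """Link work orders to tags that share a functional location.

    Returns (links, unlinked): links maps tag -> list of work order IDs,
    unlinked lists work orders whose functional location matches no tag.
    """
    # Invert the tag register so each functional location lists its tags.
    floc_to_tags = defaultdict(list)
    for tag, floc in tag_to_floc.items():
        floc_to_tags[floc].append(tag)

    links = defaultdict(list)
    unlinked = []
    for wo_id, floc, _status in work_orders:
        tags = floc_to_tags.get(floc)
        if not tags:
            unlinked.append(wo_id)
            continue
        for tag in tags:
            links[tag].append(wo_id)
    return dict(links), unlinked


links, unlinked = link_work_orders(tag_to_floc, work_orders)
print("Work orders per tag:", links)
print("Work orders still to be linked:", unlinked)
```

The unlinked list reflects the incremental approach: work orders that cannot be matched today stay visible and can be linked later as the tag register improves.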

Reality capture data

Laser Scan Data

  • Must be referenced to plant/geospatial coordinate grid.
  • High quality and accuracy preferable (±3 mm, colour scans/images).
  • .E57 Format: interoperable format for point cloud ingestion.
  • Scan Naming and Metadata preferable: clarifying location, date of capture and orientation.
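The laser scan expectations above can be expressed as a simple pre-delivery check. This is a sketch only: the ScanRecord fields, the sample file names, and the split into "required" and "recommended" messages are assumptions made for illustration, based on the wording of the list ("must" for the coordinate grid, "preferable" for the rest).

```python
from dataclasses import dataclass
from datetime import date
from pathlib import PurePath
from typing import List, Optional


@dataclass
class ScanRecord:
    """Hypothetical metadata for one laser scan; field names are illustrative."""
    filename: str
    location: Optional[str]       # e.g. deck or area name
    captured_on: Optional[date]   # date of capture
    orientation: Optional[str]    # e.g. scanner heading or registration note
    on_plant_grid: bool           # referenced to the plant/geospatial grid


def check_scan(scan: ScanRecord) -> List[str]:
    """Return a list of issues for one scan; an empty list means no issues."""
    issues = []
    if not scan.on_plant_grid:
        issues.append("required: reference the scan to the plant/geospatial grid")
    if PurePath(scan.filename).suffix.lower() != ".e57":
        issues.append("recommended: deliver the point cloud as .e57")
    for field_name in ("location", "captured_on", "orientation"):
        if getattr(scan, field_name) is None:
            issues.append(f"recommended: add {field_name} metadata")
    return issues


good = ScanRecord("deck2_area_b.e57", "Deck 2, Area B", date(2024, 5, 14), "north", True)
poor = ScanRecord("scan_0042.pts", None, date(2019, 3, 2), None, False)

for scan in (good, poor):
    print(scan.filename, "->", check_scan(scan) or "no issues")
```

Accuracy and colour are not checked here because they cannot be judged from file names and metadata alone.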

360° Panoramic Images

  • Optional, helpful for visual validation if 3D model is limited.

Reality capture and 3D Model data usage

Legacy laser scan data will be made available through the customer’s existing vendors. This data is expected to vary in quality, completeness, and naming conventions. Nonetheless, the intent is to use this existing data, and the same applies to legacy 3D model data. Despite inconsistencies or incompleteness, this data remains valuable because it provides reality-based insights essential for verifying 3D models and associated documentation, as well as for developing the visualisation layer of the digital twin. Reality capture data adds significant value by supporting the verification of as-built conditions, filling in model gaps, and aiding validation workflows.

Bringing it together: the essentials behind the recommendation

This Recommended Starter Dataset allows users to:

Explore the asset spatially

  • Navigate a realistic, spatially accurate representation of the asset.
  • Visually verify equipment, structures, and layouts as they exist in the real world.

Link tags to documents and systems

  • Connect physical objects in the model to related P&IDs, datasheets, inspection records, and enterprise systems.
  • Enable traceability and fast access to supporting information.

Access critical design and operational context

  • Visual context improves understanding of how systems are designed, installed, and operated.
  • Enables better decision-making through visual cross-reference.

Instant access to siloed datasets

  • Surface previously disconnected datasets (e.g. scans, models, documents, tags) in one unified spatial interface.
  • Reduces time spent hunting across folders, systems, and drawings. 

Expand functionality over time

  • Serves as a foundation for layering new capabilities.
  • The model grows in value as more data and integrations are added.

Remote familiarization and planning

  • Allows engineers, operators, and contractors to virtually walk down the facility from anywhere.
  • Improves safety, reduces travel, and enhances planning for tasks like maintenance, modifications, or turnarounds.

Commencing deployment with this Recommended Starter Dataset ensures fast delivery of a functional digital twin with tangible value. It lays a scalable foundation that evolves not only as existing data quality improves but also as additional data is unlocked (e.g., inspection routines, control narratives, or real-time sensor feeds).

Note: this Recommended Starter Dataset is not static. As asset teams digitise more and more of their operational data, the digital twin evolves into a richer decision-making environment. Starting lean ensures we focus on what matters today while staying ready for tomorrow.

Recommended Datasets by Project Type

Not all data needs to be complete or available from day one. The table below outlines recommended datasets based on project type or lifecycle phase, helping ensure the digital twin is both meaningful and usable from the start. Priorities may vary depending on context, but this serves as a strong foundation for planning and collaboration.



[Table: recommended datasets by project type. Legend: + = nice to have]

Further reading

This is just a brief introduction as part of our data activation series. You’ll find more information about how you can unlock the power of your data on our Data Activation homepage.