Database Subsetting
Reduce records to create a smaller, representative subset of a relational database while maintaining referential integrity
Key benefits of using Subsetting
Reduce infrastructure and computational costs
Excessive data volumes can lead to high infrastructure and computation costs, which are unnecessary for test data in non-production environments. With subsetting capabilities, you can easily create smaller subsets of your data to reduce your costs.
Make test data manageable for testers and developers
Managing huge data volumes in non-production environments poses challenges for testers and developers. Smaller subsets are easier to manage, which significantly streamlines testing and development processes.
Simplify test data management for faster setup and maintenance
Smaller data volumes facilitate faster and more straightforward setup and maintenance of non-production test environments. This is particularly relevant in complex IT landscapes.
Enable secure testing, development, and training environments
By working with smaller, representative subsets of data, organizations can establish secure environments for testing, development, and training. Because fewer records are copied out of production, this reduces the risk of exposing sensitive information.
Subsetting steps
Configure Table Settings
Include or Exclude tables for subsetting.
Adjust Rows to Generate
Define the row count in the Rows to generate field. Synthesize creates rows using AI, Duplicate samples rows from the source, and Exclude skips generating rows.

Frequently Asked Questions
Many organizations have production environments with massive amounts of data and do not want the same volumes in non-production test environments. Database subsetting addresses this by creating a smaller, representative subset of a larger relational database while preserving referential integrity. Organizations use subsetting for test data to reduce costs, to make the data manageable, and to speed up setup and maintenance.
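As a rough illustration of the idea, the sketch below samples a fraction of a parent table and keeps only the child rows that point to the sampled parents. The table names (customers, orders), the column names, and the function subset_with_integrity are hypothetical and chosen for this example only; they do not describe how any particular subsetting tool works internally.

```python
import random


def subset_with_integrity(customers, orders, fraction, seed=0):
    """Keep a random fraction of customers and only the orders that reference them."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    keep_count = max(1, round(len(customers) * fraction))
    kept_customers = rng.sample(customers, keep_count)
    kept_ids = {c["id"] for c in kept_customers}
    # Drop every order whose parent customer was not sampled,
    # so no foreign key in the subset points to a missing row.
    kept_orders = [o for o in orders if o["customer_id"] in kept_ids]
    return kept_customers, kept_orders


# Hypothetical source data: 100 customers with 5 orders each.
customers = [{"id": i, "name": f"customer_{i}"} for i in range(1, 101)]
orders = [{"id": n, "customer_id": (n % 100) + 1} for n in range(1, 501)]

small_customers, small_orders = subset_with_integrity(customers, orders, fraction=0.1)
```

With a fraction of 0.1, the subset holds 10 customers and only the orders that belong to them. A real schema usually has several levels of linked tables, so the same filtering step would have to be repeated along every foreign key.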
Referential integrity is a concept in database management that ensures consistency and accuracy between tables in a relational database. For example, referential integrity ensures that every value that refers to "Person 1" in "Table 1" corresponds to the correct "Person 1" in "Table 2" and in any other linked table. Enforcing referential integrity is crucial for keeping test data reliable in the relational databases of non-production environments.
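One way to check referential integrity after subsetting is to look for "orphan" rows, meaning child rows whose foreign key has no matching parent. The sketch below uses Python's built-in sqlite3 module with two hypothetical tables, person and address; the schema, the sample rows, and the find_orphans helper are assumptions made for illustration.

```python
import sqlite3


def find_orphans(conn):
    """Return address rows whose person_id has no matching row in person."""
    query = """
        SELECT a.id, a.person_id
        FROM address AS a
        LEFT JOIN person AS p ON p.id = a.person_id
        WHERE p.id IS NULL
        ORDER BY a.id
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE address (id INTEGER PRIMARY KEY, person_id INTEGER, city TEXT);
    INSERT INTO person VALUES (1, 'Person 1');
    INSERT INTO address VALUES (10, 1, 'Amsterdam');  -- valid reference
    INSERT INTO address VALUES (11, 2, 'Utrecht');    -- person 2 does not exist
""")

orphans = find_orphans(conn)
```

Here the address with id 11 refers to a person that is not in the person table, so it is reported as an orphan. A subset that preserves referential integrity should return an empty list for every parent-child relationship.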