I want to create multiple versions of the same underlying test data.
Step 1 is easy - define the data in a schema.
Step 2 - I want output that is largely the same as in Step 1, but with some rules applied so that specific records are removed, some fields from the original dataset are changed, etc.
Then I want to repeat step 2 a few times.
What is the best approach?
I am thinking of one schema for the original, then a second schema (a copy of the first) that reads in the original dataset, sets every field to the corresponding value from the original dataset, and applies the variations through formulas.
Will this work or is there a better way?
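
To make the idea concrete, here is a rough sketch in plain Python of what I have in mind. It is not tied to any particular tool. The sample records, the field names, and the `make_variant` helper are all made up for illustration. Each variant starts from the same original dataset, drops the records that fail a rule, and overrides some fields with computed values.

```python
import copy

# Step 1: the original dataset (made-up records, for illustration only).
BASE_RECORDS = [
    {"id": 1, "name": "Alice", "country": "US", "balance": 120.0},
    {"id": 2, "name": "Bob", "country": "DE", "balance": -15.5},
    {"id": 3, "name": "Carla", "country": "US", "balance": 0.0},
]


def make_variant(records, keep, changes):
    """Build a new dataset from `records` without modifying the original.

    keep:    function(record) -> bool; records returning False are removed.
    changes: dict mapping a field name to function(record) -> new value,
             playing the role of a per-field formula.
    """
    variant = []
    for record in records:
        if not keep(record):
            continue  # rule says this record is removed
        new_record = copy.deepcopy(record)  # start from the original values
        for field, rule in changes.items():
            new_record[field] = rule(record)  # apply the variation
        variant.append(new_record)
    return variant


# Step 2, repeated: each variant is derived from the same original dataset.
variant_a = make_variant(
    BASE_RECORDS,
    keep=lambda r: r["balance"] >= 0,
    changes={"name": lambda r: r["name"].upper()},
)
variant_b = make_variant(
    BASE_RECORDS,
    keep=lambda r: r["country"] == "US",
    changes={"balance": lambda r: round(r["balance"] * 1.1, 2)},
)
```

The point of the sketch is that the original data is defined once and every variant is only a set of rules on top of it, so a change to the original flows into all the variants.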