Repeating records with increment dates based on previous row value

bcariou · March 10, 2017, 1:11am

I want to generate records with row over row effective valid beginning date and effective end date dependencies
The first record would have an effective date of 1900/01/01 and an effective end date of a random date value.
Then each subsequent record would have an effective begin date that would be the effective end date from the previous record.
The last version of a row of a specific ID would have an effective end date of 9999/01/01.
The number of iterations for a given ID is variable.
File could look like that.
Is this something possible with Mokaroo?
Thanks

ID EFF_BEGIN_DATE EFF_END_DATE
1 1900/01/01 2015/02/18
1 2015/02/18 2016/05/12
1 2016/05/12 9999/01/01
2 1900/01/01 2012/03/13
2 2012/03/13 2014/05/12
2 2014/05/12 2015/01/23
2 2015/01/23 9999/01/01
…
…

mockaroo · March 10, 2017, 2:56am

I doubt it. At the very least it wouldn’t be easy without some serious trickery. There’s no way to access the previous record in Mockaroo because data generation is parallelized across multiple processes. You might be able to get there by doing this in two passes:

First, generate a dataset with two fields: row number and a random date:

row (Row Number)
date (Formula) date('1/1/2010') + months(row) + days(random(0,31))

Then upload this as a dataset, and in a new schema, use a formula like this to access the previous row given the current row:

current_row (Row Number)
start_date (Formula) from_dataset("dates", "date", row: current_row - 1)
end_date (Formula) from_dataset("dates", "date", row: current_row)

bcariou · March 10, 2017, 5:33pm

Thanks Mark. This part works. It solves one part of the problem. Maybe an addition to Mockaroo would be some global variables that can be set to store values to compare row other rows.

ajijacob · September 27, 2019, 8:29pm

Hi - I am trying this but throwing error. Is this feature removed?

Error evaluating from_dataset(“dates”, “date”, “{“row”:0}”): Dataset “dates” not found.

ajijacob · September 29, 2019, 2:41am

It worked but issue is that the 1st record is always blank. Also the employment dates of a person is not after every month,. it could be 1year or more years.

Is there something like to generate employment dates? Thanks

ajijacob · September 30, 2019, 4:00pm

When using it with json array, throwing this error:

error: Field ‘employ’ not found

ajijacob · September 30, 2019, 4:49pm

This works:
from_dataset(“from_dataset”, “dates”,row: field(‘employ.current_row’))

user95 · July 5, 2022, 7:57pm

I know this discussion is old, but I believe I can add to ajijacob’s comment surrounding “the first row will always be blank.” becased on the following line:
from_dataset(“dates”, “date”, row: current_row - 1)

What one can do in this aspect is define the “id” field, which is sequential, as the first row. From they you can utilize an ‘if’ statement.

if field(‘id’) == ‘1’ then
else from_dataset(“dates”, “date”, row: current_row - 1)
end

just a thought.