From_dataset: select data by comparing 2 mongo ids

I’m fairly new to mockaroo, but I am being stumped by this one.

I have a generated dataset that contains a mongo id, that I’m using as a primary key and a second mongo id that I want to perform a lookup against.
Here is the formula:
from_dataset(“license”, “_id”, driverID: userID)

Getting this error:
“Error evaluating from_dataset(“license”, “driverID”, “{“driverID”:{”$oid”:“66435f59e10537ef2d0d1b6e”}}"): Empty: no JSON found"

If I change the formula to : from_dataset(“license”, “_id”, driverID => userID)
I get this error: “error: Field ‘driverID’ not found”

I can pull random records from the license dataset, using the column type "dataset column’, but in this use case I need to find a match.

license dataset sample:
{"_id":{"$oid":“664f9430e10537f0a3679abd”},“driverID”:{"$oid":“66435f59e10537f0a3679974”},“status”:“active”,“issueState”:“FL”,“number”:“ER92102271B247”}

driver dataset sample:
{"_id":{"$oid":“66435f59e10537f0a3679974”},“userID”:{"$oid":“66435f59e10537f0a3679974”},“status”:“active”,“startDate”:{"$date":“2024-07-04T14:55:56.000Z”},“yearsDriving”:12,“rates”:[{"$oid":“664b6e60e10537f11e8f5056”}],“marketingDesc”:“Morbi non lectus. Aliquam sit amet diam in magna bibendum imperdiet. Nullam orci pede, venenatis non, sodales sed, tincidunt eu, felis.”,“license”:“Error evaluating from_dataset(“license”, “_id”, “{”{\”$oid\"=\u003e\“driverID\”}":{"$oid":{"$oid":“66435f59e10537f0a3679974”}}}"): Empty: no JSON found"}

Any direction would be greatly appreciated

I have simplified the schemas and shared them, see below.
https://mockaroo.com/010d01c0
https://mockaroo.com/8685de70
Mockaroo - Random Data Generator and API Mocking Tool | JSON / CSV / SQL / Excel

After doing some more testing, it appears that if the lookup dataset contains JSON data, then the error occurs.

How did you manage to get the JSON data into your dataset? I am only able to get csv data up there. I know the dataset FAQ says it works with JSON, but I have not been able to load that format. I always get a parse error asking if the file was UTF8 CSV. I tried different extensions (txt, csv, json), same thing each time.

I am not able to run your first example because I don’t have access to your lookup dataset and the samples you posted just give errors when I attempt to load them as mentioned above.

That said, I did the following:

  1. Loaded a csv file containing this:
second_col,first_col
testing,"{""id"":""first""}"

Note the 2 double-quotes where a single double-quote normally would go and that the JSON object is surrounded by quotes.

  1. With the csv file I posted above, the following two hard-coded criteria values worked for me;
from_dataset("license", "first_col", second_col: "testing")
from_dataset("license", "second_col", first_col: "{\"id\":\"first\"}")

As well as referencing the values via a field reference (the “id” field in the last row criteria is the “id” field on the first row):

This works only with the entire JSON object as the search value - I’m not able to parse into it either.

If you could tell me how you got the json object up as a dataset, I could play around some more, but otherwise, this is the best I could do.

Also, using “=>” never worked for me. I always had to use ": " in the criteria.

I created the schema and utilized the “.” notation to create the JSON object.

I shared 3 schemas which shows the issue in a cleaner format. I’ll take a look and try your suggestion.
Here is an example of lookup_error:
{
“id”: 1,
“first_name”: “Janie”,
“last_name”: “Dawber”,
“email”: “jdawber0@icq.com”,
“addr”: {
“line1”: “8 Merry Parkway”,
“state”: “FL”
}
}

Here is an example of lookup:
{
“id”: 1,
“first_name”: “Ezequiel”,
“last_name”: “Stitson”,
“email”: “estitson0@blogspot.com”,
“line1”: “1 Ridgeview Pass”,
“state”: “FL”
}

I’m also looking to have the JSON field, in the left-hand side of the criteria
Thanks

I am able to create the JSON data. But when I try to upload it to a dataset, so I can reference it via the from_dataset function you’re having trouble with, I get the error I noted. You were able to upload the generated JSON file exactly as Mockaroo created it, without modification and using the “Upload a Dataset” button on the My Datasets page?

Uploading a dataset, can only be done using a csv file.
What I did: save the generated dataset as a CSV file. I then modified the from_dataset formula to use the CSV version of the dataset. For Example: from_dataset(“lookupErrorv2”,“id”, “addr.state”: “FL”). I found that the ‘JSON’ data in the csv had to be quoted. in order for the data to be returned.
See the attached csv:
id,first_name,last_name,email,addr.line1,addr.state
1,Konstantine,Blannin,kblannin0@amazon.co.uk,1 Cody Junction,FL
2,Zarla,Ilchuk,zilchuk1@dmoz.org,03 Bluejay Center,FL
3,Briana,Koene,bkoene2@tinypic.com,5 Ilene Point,FL
4,Clem,Gofford,cgofford3@geocities.com,7 Glacier Hill Avenue,FL
5,Bamby,Freeman,bbertomieu4@issuu.com,0586 Utah Avenue,FL

This is related to From_dataset() doesn’t work with json datasets - Mockaroo Community Forum.

It would be great if the from_dataset would work on JSON datasets and not just CSV.

Any chance on making this happen? @mockaroo