22-08-24 07:46 AM
Hi all,
When receiving the response from a BigQuery.jobs query using Skills, I would like to have the data structured in a collection. However, I'm having trouble achieving this and always end up with rows that require looping to unnest.
The entire response will consist of upwards of 100,000 records, so having to loop each row to unnest is not an ideal solution in this case.
I've tried using the updated Function for Utility - JSON - 10.0.1 on DX and also attempted the solution provided by Eric: https://community.blueprism.com/t5/Product-Forum/Web-Services-Json-Path/td-p/73574
However, it doesn't seem to work in our case. Additionally, I've tried the custom Business Object from Symphony Ventures: Ventures: https://digitalexchange.blueprism.com/dx/entry/7888/solution/json-manipulation-c
The response looks like this (with sensitive information removed):
{
"kind": "bigquery#queryResponse",
"schema": {
"fields": [
{
"name": "json",
"type": "STRING",
"mode": "NULLABLE"
}
]
},
"jobReference": {
"projectId": "production",
"jobId": "XXXXXXX",
"location": "europe"
},
"totalRows": "41158",
"pageToken": "123456789xxx======",
"rows": [
{
"f": [
{
"v": "{\"id\":\"A1\",\"codes\":\"123,234\"}"
}
]
},
{
"f": [
{
"v": "{\"id\":\"A2\",\"codes\":\"123\"}"
}
]
}
],
"totalBytesProcessed": "403783338",
"jobComplete": true,
"cacheHit": false,
"queryId": "job_xxx",
"jobCreationReason": {
"code": "REQUESTED"
}
}
I would like to end up with the data structured in a collection, similar to this:
Thank you for any help on this matter.
Best regards,
Nicolai
Answered! Go to Answer.
23-08-24 11:32 AM - edited 23-08-24 11:34 AM
Hey Hi @nicolailykke
Here you go.....
I have used the Utility Strings object(Latest version - Blue Prism Digital Exchange), where we have Utility - Strings::Extract Regex All Matches
Regex Pattern: "v":\s*"\{\\\"id\\\":\\\"([^\\\"]+)\\\",\\\"codes\\\":\\\"([^\\\"]+)\\\"\}"
Hope this helps your problem 🙂
Thanks,
Nandha
22-08-24 10:20 AM
You might want to try the following approach:
Assume you have a collection named "Response." Within this collection, there's a JSON array column that contains another collection.
To work with this, use a calculation stage. In the calculation stage, reference "[Response.JsonArray]" and store the result in a new collection called "Results." After doing this, the "Results" collection will contain only the rows from the JSON array column in the "Response" collection.
22-08-24 10:55 AM
Hi @nicolailykke ,
You can try to use the JSON to Collection (skipping JArray) action in Utility JSON to skip the JArray and get the required data rows in collection instead of nested collections.
Hope this helps 🙂
22-08-24 11:11 AM
Hi Parthiban,
Thank you for the answer. Do you have a example of doing this without having to loop through the initial collection?
I can't get that to work without looping which is the issue.
Best regards,
Nicolai
22-08-24 11:14 AM
Hi Nandhakumar,
Thank you for the answer.
I have used the action from the utillity but I still have to loop in order to unnest.
I have provided an example of the response in my original post. Can you make that to work with the JSON to Collection (skipping JArray) action? I can't.
Best regards,
Nicolai
22-08-24 01:26 PM
there are 3 simple ways you can do this, using the Utility - Collection Manipulation Extended VBO, firstly you create a loop for the main collection and use the 'Filter Collection' action using the embedded collection as the main collection and leaving the filter blank. This will then output all rows to a new collection as required (see screens below), the drawback here is if you need to put all collections into a single output collection then the next filter would delete the previous data.
The second option is to use the 'Append rows to collection' action and similarly you just need to ad the embedded collection as the main collection and the appended collection is you output collection (see screens below). The drawback on this one is that you will need to know the fields and have them entered in the output collection before you start.
The third option is probably the preferred option which combines filtered and appended actions to get all collections into a single output collection. This will have to count the rows and if its row1, first embedded collection, then it will filter this to the output collection. If its not the first row/collection then it will append the collections with the output collection. The reason this is the best of the 3 is that the filter will create the fields for you from the first collection so you wont need to know these beforehand and then append all additional collections to the output, provided they all share the same fields.
Hope this helps 🙂
22-08-24 03:28 PM
Hi @nicolailykke ,
May i check about your JSON data? whether it is single record related info which has nested information or you may get a JSON data which may have multiple records?
Assuming that it is single record then i have followed below steps to not to use the loop with you main collection, let me know if this suits your issue.
Use JSON utility - Skip JArray action with the JSON Data you have provided
Then use a calculation stage to assign the rows (Collection) in the above into a new collection.
The use a loop to iterate through you intended output rows, where i can see it is still Json data then you can again use the utility to convert to collection and then append the rows to a final collection.
Hope this helps.
Incase if your initial Json data has multiple records like below then you need to loop the initial collection.
23-08-24 08:28 AM
Hi Michael,
Thank you for your solution 🙂 However, I am seeking a solution where I can avoid the looping part.
I'm not sure if such a solution exists, but I end up having to loop through upwards of 100,000 rows, and that would take many hours each time the process is run daily. On our local test environment, it takes over 6 hours (when not run from the control room).
I am using this skill:
And in the response content (as shown in the original post), I end up with many thousands of arrays in rows, like this, in the response.
{
"f": [
{
"v": "{\"id\":\"A1\",\"codes\":\"123,234\"}"
}
]
},
23-08-24 08:33 AM
Hi Nandhakumar,
Thank you very much for taking the time to post a solution.
Your solution does work, but it takes over 6 hours to loop through the rows on my test machine when running with our production data. I was hoping to find an alternative approach (if one exists) that might be more efficient.
23-08-24 11:32 AM - edited 23-08-24 11:34 AM
Hey Hi @nicolailykke
Here you go.....
I have used the Utility Strings object(Latest version - Blue Prism Digital Exchange), where we have Utility - Strings::Extract Regex All Matches
Regex Pattern: "v":\s*"\{\\\"id\\\":\\\"([^\\\"]+)\\\",\\\"codes\\\":\\\"([^\\\"]+)\\\"\}"
Hope this helps your problem 🙂
Thanks,
Nandha