Monday - last edited Monday
Hi All,
Does anybody get a chance to work on Databricks API ?
Databricks API result comes in JSON and the JSON format is different compare to other API results as here in Databricks API we are getting columns in different array of object and actual data in different array of object.
What is the best way to convert this into proper data table (collection) with headers ?
Example:
{
"schema": {
"columns":
[
{
"name": "headername",
"type": "string"
},
{
"name": "headername2",
"type": "date"
}
]
},
"result":
{
"data_array":[
[
"row1header1value",
"row1header2value"
],
[
"row2header1value",
"row2header2value"
]
]
}
}#API, #databricks, #json
Answered! Go to Answer.
yesterday
finally i have written the code to convert into table format,
here is sample Databricks API JSON output ( i have removed un wanted columns from JSON and kept only required one)
{
"manifest": {
"schema": {
"columns":
[{
"name": "headername",
"type_name": "string"
}, {
"name": "headername2",
"type_name": "string"
}
]
}
},
"result": {
"data_array": [
[
"row1header1value",
"row1header2value"
],
[
"row2header1value",
"row2header2value"
]
]
}
}here is my vb.net code, which i have written as a Global Code and using code staging calling this code by passing entire Databricks API JSON string as an input , and in output fetching Datatable and record count.
Note: make sure to add all necessary DLLs and namespace
Public Function ConvertDatabricksResultToDataTable(jsonStr As String) As DataTable
Dim dt As New DataTable()
' Parse the JSON
Dim jObj = JObject.Parse(jsonStr)
' Get Columns from manifest.schema.columns and adding to data table
Dim cols = jObj("manifest")("schema")("columns")
For Each col In cols
Dim colName As String = col("name").ToString()
Dim colType As String = col("type_name").ToString()
' map Databricks type to VB.NET type
Select Case colType.ToUpper()
Case "LONG", "BIGINT"
dt.Columns.Add(colName, GetType(Long))
Case "INTEGER", "INT"
dt.Columns.Add(colName, GetType(Integer))
Case "DECIMAL", "DOUBLE", "FLOAT"
dt.Columns.Add(colName, GetType(Decimal))
Case "DATE", "TIMESTAMP", "DATETIME"
dt.Columns.Add(colName, GetType(Date))
Case "BOOLEAN", "BOOL"
dt.Columns.Add(colName, GetType(Boolean))
Case Else
dt.Columns.Add(colName, GetType(String))
End Select
Next
' Looping each record and addings rows from result.data_array
Dim rows = jObj("result")("data_array")
For Each r In rows
Dim arr As New List(Of Object)
For i = 0 To dt.Columns.Count - 1
Dim value = r(i).ToString()
' Convert data types per column
Dim colType = dt.Columns(i).DataType
If colType Is GetType(Long) Then
arr.Add(Long.Parse(value))
ElseIf colType Is GetType(Integer) Then
arr.Add(Integer.Parse(value))
ElseIf colType Is GetType(Decimal) Then
arr.Add(Decimal.Parse(value))
ElseIf colType Is GetType(Date) Then
Dim d as Date
If Date.TryParseExact(value,{"yyyy-MM-dd","yyyy-MM-ddTHH:mm:ss","yyyy-MM-ddTHH:mm:ssZ"},System.Globalization.CultureInfo.InvariantCulture,Globalization.DateTimeStyles.None,d) then
arr.Add(d)
else
arr.Add(DBNull.Value)
End if
ElseIf colType Is GetType(Boolean) Then
Select Case value.ToString().Trim().ToLower()
Case "true", "1", "yes"
arr.Add(True)
Case "false", "0", "no"
arr.Add(False)
Case Else
arr.Add(DBNull.Value)
End Select
Else
arr.Add(value) ' fallback string
End If
Next
dt.Rows.Add(arr.ToArray())
Next
Return dt
End FunctionPlease do let us know, if this solution is best to go or can we do it in any other better way. Hope this will help for others also.
Regards,
Mohammad Naveed
Monday
i reviewed other options like OLEDB and ODBC,
Looks like using OLEDB we can't connect to Databricks, and using ODBC , we have to do some work around and for that looks like we have install Databricks odbc driver and do setup in bots to make this callable so that will get data in proper table format.
by seeing this , i have to go with Databricks API only and have to found a way to convert that JSON in proper table format
yesterday
i thought of storing columns collection and data collection separately and doing loop for data , but thinking looping is not the good idea here.
i am writing VB.Net code to convert this into proper table format, will share the code here once its done
yesterday
Hi @naveed_raza
Have you tried Utility - JSON VBO->JSON to Collection
Best Regards,
Sayeed Bin Abdullah
yesterday
Using JSON Utility , it convert in table, however one column we have nested collection and another column will have data collection.
One Table will have 2 columns , and each column will have nested collection, headers in one column, actual data in another
yesterday - last edited yesterday
@SayeedBinAbdullah - this is how its showing , if we use JSON utility
schema column will have columns , and result column will have actual data row. i am done with coding, testing , will share the code shortly.
yesterday
finally i have written the code to convert into table format,
here is sample Databricks API JSON output ( i have removed un wanted columns from JSON and kept only required one)
{
"manifest": {
"schema": {
"columns":
[{
"name": "headername",
"type_name": "string"
}, {
"name": "headername2",
"type_name": "string"
}
]
}
},
"result": {
"data_array": [
[
"row1header1value",
"row1header2value"
],
[
"row2header1value",
"row2header2value"
]
]
}
}here is my vb.net code, which i have written as a Global Code and using code staging calling this code by passing entire Databricks API JSON string as an input , and in output fetching Datatable and record count.
Note: make sure to add all necessary DLLs and namespace
Public Function ConvertDatabricksResultToDataTable(jsonStr As String) As DataTable
Dim dt As New DataTable()
' Parse the JSON
Dim jObj = JObject.Parse(jsonStr)
' Get Columns from manifest.schema.columns and adding to data table
Dim cols = jObj("manifest")("schema")("columns")
For Each col In cols
Dim colName As String = col("name").ToString()
Dim colType As String = col("type_name").ToString()
' map Databricks type to VB.NET type
Select Case colType.ToUpper()
Case "LONG", "BIGINT"
dt.Columns.Add(colName, GetType(Long))
Case "INTEGER", "INT"
dt.Columns.Add(colName, GetType(Integer))
Case "DECIMAL", "DOUBLE", "FLOAT"
dt.Columns.Add(colName, GetType(Decimal))
Case "DATE", "TIMESTAMP", "DATETIME"
dt.Columns.Add(colName, GetType(Date))
Case "BOOLEAN", "BOOL"
dt.Columns.Add(colName, GetType(Boolean))
Case Else
dt.Columns.Add(colName, GetType(String))
End Select
Next
' Looping each record and addings rows from result.data_array
Dim rows = jObj("result")("data_array")
For Each r In rows
Dim arr As New List(Of Object)
For i = 0 To dt.Columns.Count - 1
Dim value = r(i).ToString()
' Convert data types per column
Dim colType = dt.Columns(i).DataType
If colType Is GetType(Long) Then
arr.Add(Long.Parse(value))
ElseIf colType Is GetType(Integer) Then
arr.Add(Integer.Parse(value))
ElseIf colType Is GetType(Decimal) Then
arr.Add(Decimal.Parse(value))
ElseIf colType Is GetType(Date) Then
Dim d as Date
If Date.TryParseExact(value,{"yyyy-MM-dd","yyyy-MM-ddTHH:mm:ss","yyyy-MM-ddTHH:mm:ssZ"},System.Globalization.CultureInfo.InvariantCulture,Globalization.DateTimeStyles.None,d) then
arr.Add(d)
else
arr.Add(DBNull.Value)
End if
ElseIf colType Is GetType(Boolean) Then
Select Case value.ToString().Trim().ToLower()
Case "true", "1", "yes"
arr.Add(True)
Case "false", "0", "no"
arr.Add(False)
Case Else
arr.Add(DBNull.Value)
End Select
Else
arr.Add(value) ' fallback string
End If
Next
dt.Rows.Add(arr.ToArray())
Next
Return dt
End FunctionPlease do let us know, if this solution is best to go or can we do it in any other better way. Hope this will help for others also.
Regards,
Mohammad Naveed
4 hours ago
Hi Naveed,
Can you please list down all the dll's required?