cancel
Showing results for 
Search instead for 
Did you mean: 

Databricks API JSON Result to Collection

naveed_raza
Level 7

Hi All,

Does anybody get a chance to work on Databricks API ?

Databricks API result comes in JSON and the JSON format is different compare to other API results as here in Databricks API we are getting columns in different array of object and actual data in different array of object.

What is the best way to convert this into proper data table (collection) with headers ?

Example: 

{
"schema": {
   "columns": 
      [
        {
          "name": "headername",
          "type": "string"
        },
        {
          "name": "headername2",
          "type": "date"
        }
      ]
  },
 "result": 
   {
     "data_array":[
           [
             "row1header1value",
             "row1header2value"
           ],
           [
             "row2header1value",
             "row2header2value"
           ]
         ]
   }
}

 #API, #databricks, #json

1 BEST ANSWER

Helpful Answers

naveed_raza
Level 7

finally i have written the code to convert into table format,

here is sample Databricks API JSON output ( i have removed un wanted columns from JSON and kept only required one)

{
    "manifest": {
        "schema": {
            "columns":
            [{
                    "name": "headername",
                    "type_name": "string"
                }, {
                    "name": "headername2",
                    "type_name": "string"
                }
            ]
        }
    },
    "result": {
        "data_array": [
            [
                "row1header1value",
                "row1header2value"
            ],
            [
                "row2header1value",
                "row2header2value"
            ]
        ]
    }
}

 here is my vb.net code, which i have written as a Global Code and using code staging calling this code by passing entire Databricks API JSON string as an input , and in output fetching Datatable and record count.

Note: make sure to add all necessary DLLs and namespace

Public Function ConvertDatabricksResultToDataTable(jsonStr As String) As DataTable

	Dim dt As New DataTable()

 ' Parse the JSON
	Dim jObj = JObject.Parse(jsonStr)

 ' Get Columns from manifest.schema.columns and adding to data table
	 Dim cols = jObj("manifest")("schema")("columns")

	 For Each col In cols
		 Dim colName As String = col("name").ToString()
		 Dim colType As String = col("type_name").ToString()

		' map Databricks type to VB.NET type
		 Select Case colType.ToUpper()
			Case "LONG", "BIGINT"
				dt.Columns.Add(colName, GetType(Long))
			Case "INTEGER", "INT"
				dt.Columns.Add(colName, GetType(Integer))
			Case "DECIMAL", "DOUBLE", "FLOAT"
				dt.Columns.Add(colName, GetType(Decimal))
			Case "DATE", "TIMESTAMP", "DATETIME"
				dt.Columns.Add(colName, GetType(Date))
			Case "BOOLEAN", "BOOL"
				dt.Columns.Add(colName, GetType(Boolean))
			Case Else
				dt.Columns.Add(colName, GetType(String))
		 End Select
	 Next

 ' Looping each record and addings rows from result.data_array
	Dim rows = jObj("result")("data_array")

	For Each r In rows
		Dim arr As New List(Of Object)

		For i = 0 To dt.Columns.Count - 1
			Dim value = r(i).ToString()

			' Convert data types per column
			Dim colType = dt.Columns(i).DataType

			If colType Is GetType(Long) Then
				arr.Add(Long.Parse(value))
			ElseIf colType Is GetType(Integer) Then
				arr.Add(Integer.Parse(value))
			ElseIf colType Is GetType(Decimal) Then
				arr.Add(Decimal.Parse(value))
			ElseIf colType Is GetType(Date) Then
				Dim d as Date
				If Date.TryParseExact(value,{"yyyy-MM-dd","yyyy-MM-ddTHH:mm:ss","yyyy-MM-ddTHH:mm:ssZ"},System.Globalization.CultureInfo.InvariantCulture,Globalization.DateTimeStyles.None,d) then
					arr.Add(d)
				else
					arr.Add(DBNull.Value)
				End if
			ElseIf colType Is GetType(Boolean) Then
				Select Case value.ToString().Trim().ToLower()
					Case "true", "1", "yes"
						arr.Add(True)
					Case "false", "0", "no"
						arr.Add(False)
					Case Else
						arr.Add(DBNull.Value)
				End Select
			Else
				arr.Add(value) ' fallback string
			End If
		Next

		dt.Rows.Add(arr.ToArray())
	Next

	Return dt

End Function

 Please do let us know, if this solution is best to go or can we do it in any other better way. Hope this will help for others also.

 

Regards,

Mohammad Naveed

View answer in original post

7 REPLIES 7

naveed_raza
Level 7

i reviewed other options like OLEDB and ODBC,

Looks like using OLEDB we can't connect to Databricks, and using ODBC , we have to do some work around and for that looks like we have install Databricks odbc driver and do setup in bots to make this callable so that will get data in proper table format.

by seeing this , i have to go with Databricks API only and have to found a way to convert that JSON in proper table format

naveed_raza
Level 7

i thought of storing columns collection and data collection separately and doing loop for data , but thinking looping is not the good idea here.

i am writing VB.Net code to convert this into proper table format, will share the code here once its done

Hi @naveed_raza 

Have you tried Utility - JSON VBO->JSON to Collection 

Best Regards,

Sayeed Bin Abdullah 

naveed_raza
Level 7

Using JSON Utility , it convert in table, however one column we have nested collection and another column will have data collection.

One Table will have 2 columns , and each column will have nested collection, headers in one column, actual data in another

naveed_raza
Level 7

@SayeedBinAbdullah - this is how its showing , if we use JSON utility 

schema column will have columns , and result column will have actual data row. i am done with coding, testing , will share the code shortly.

 

naveed_raza_0-1765309734727.png

naveed_raza_2-1765309850239.png

naveed_raza_3-1765309878077.png

 

 

naveed_raza
Level 7

finally i have written the code to convert into table format,

here is sample Databricks API JSON output ( i have removed un wanted columns from JSON and kept only required one)

{
    "manifest": {
        "schema": {
            "columns":
            [{
                    "name": "headername",
                    "type_name": "string"
                }, {
                    "name": "headername2",
                    "type_name": "string"
                }
            ]
        }
    },
    "result": {
        "data_array": [
            [
                "row1header1value",
                "row1header2value"
            ],
            [
                "row2header1value",
                "row2header2value"
            ]
        ]
    }
}

 here is my vb.net code, which i have written as a Global Code and using code staging calling this code by passing entire Databricks API JSON string as an input , and in output fetching Datatable and record count.

Note: make sure to add all necessary DLLs and namespace

Public Function ConvertDatabricksResultToDataTable(jsonStr As String) As DataTable

	Dim dt As New DataTable()

 ' Parse the JSON
	Dim jObj = JObject.Parse(jsonStr)

 ' Get Columns from manifest.schema.columns and adding to data table
	 Dim cols = jObj("manifest")("schema")("columns")

	 For Each col In cols
		 Dim colName As String = col("name").ToString()
		 Dim colType As String = col("type_name").ToString()

		' map Databricks type to VB.NET type
		 Select Case colType.ToUpper()
			Case "LONG", "BIGINT"
				dt.Columns.Add(colName, GetType(Long))
			Case "INTEGER", "INT"
				dt.Columns.Add(colName, GetType(Integer))
			Case "DECIMAL", "DOUBLE", "FLOAT"
				dt.Columns.Add(colName, GetType(Decimal))
			Case "DATE", "TIMESTAMP", "DATETIME"
				dt.Columns.Add(colName, GetType(Date))
			Case "BOOLEAN", "BOOL"
				dt.Columns.Add(colName, GetType(Boolean))
			Case Else
				dt.Columns.Add(colName, GetType(String))
		 End Select
	 Next

 ' Looping each record and addings rows from result.data_array
	Dim rows = jObj("result")("data_array")

	For Each r In rows
		Dim arr As New List(Of Object)

		For i = 0 To dt.Columns.Count - 1
			Dim value = r(i).ToString()

			' Convert data types per column
			Dim colType = dt.Columns(i).DataType

			If colType Is GetType(Long) Then
				arr.Add(Long.Parse(value))
			ElseIf colType Is GetType(Integer) Then
				arr.Add(Integer.Parse(value))
			ElseIf colType Is GetType(Decimal) Then
				arr.Add(Decimal.Parse(value))
			ElseIf colType Is GetType(Date) Then
				Dim d as Date
				If Date.TryParseExact(value,{"yyyy-MM-dd","yyyy-MM-ddTHH:mm:ss","yyyy-MM-ddTHH:mm:ssZ"},System.Globalization.CultureInfo.InvariantCulture,Globalization.DateTimeStyles.None,d) then
					arr.Add(d)
				else
					arr.Add(DBNull.Value)
				End if
			ElseIf colType Is GetType(Boolean) Then
				Select Case value.ToString().Trim().ToLower()
					Case "true", "1", "yes"
						arr.Add(True)
					Case "false", "0", "no"
						arr.Add(False)
					Case Else
						arr.Add(DBNull.Value)
				End Select
			Else
				arr.Add(value) ' fallback string
			End If
		Next

		dt.Rows.Add(arr.ToArray())
	Next

	Return dt

End Function

 Please do let us know, if this solution is best to go or can we do it in any other better way. Hope this will help for others also.

 

Regards,

Mohammad Naveed

Hi Naveed,

Can you please list down all the dll's required?