onnxruntime Package installation in Python Plugin from Azure Data Explorer Fails

I want to install the onnxruntime package using the Python plugin in Azure Data Explorer. I followed the instructions at https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/pythonplugin?pivots=azuredataexplorer.

However, I can't get it to work. It always fails with "ImportError: cannot import name 'get_all_providers'".

Here is my kusto statement from the Azure Data Explorer:

let predict_onnx_fl=(samples:(*),  model_name:string, features_cols:dynamic, pred_col:string){
let kwargs = pack('features_cols', features_cols, 'pred_col', pred_col);
let code =
'\n'
'import pickle\n'
'import binascii\n'
'from sandbox_utils import Zipackage\n'
'Zipackage.install("onnx2.zip")\n'
'import onnxruntime as rt\n'
'\n'
'smodel = "08031208736b6c326f6e6e781a05312e372e30220761692e6f6e6e78280032003adb040af0010a0b666c6f61745f696e70757412056c6162656c121270726f626162696c6974795f74656e736f721a104c696e656172436c617373696669657222104c696e656172436c61737369666965722a210a13636c6173736c6162656c735f737472696e67734a026e6f4a03796573a001082a250a0c636f656666696369656e74733d637146c13d000000803d637146413d00000000a001062a190a0a696e74657263657074733d391eba433d391ebac3a001062a120a0b6d756c74695f636c6173731801a001022a1d0a0e706f73745f7472616e73666f726d22084c4f474953544943a001033a0a61692e6f6e6e782e6d6c0a560a1270726f626162696c6974795f74656e736f72120d70726f626162696c69746965731a0a4e6f726d616c697a6572220a4e6f726d616c697a65722a0d0a046e6f726d22024c31a001033a0a61692e6f6e6e782e6d6c0a2b0a056c6162656c120c6f75747075745f6c6162656c1a084964656e7469747922084964656e746974793a000a620a0d70726f626162696c697469657312126f75747075745f70726f626162696c6974791a065a69704d617022065a69704d61702a210a13636c6173736c6162656c735f737472696e67734a026e6f4a03796573a001083a0a61692e6f6e6e782e6d6c122039626230386630393965383234353034616662366161653835633236663437355a1b0a0b666c6f61745f696e707574120c0a0a080112060a000a02080262180a0c6f75747075745f6c6162656c12080a06080812020a0062240a126f75747075745f70726f626162696c697479120e220c0a0a2a08080812040a02080142040a001001420e0a0a61692e6f6e6e782e6d6c1001"\n'
'features_cols = kargs["features_cols"]\n'
'pred_col = kargs["pred_col"]\n'
'bmodel = binascii.unhexlify(smodel)\n'
'\n'
'sess = rt.InferenceSession(bmodel)\n'
'input_name = sess.get_inputs()[0].name\n'
'label_name = sess.get_outputs()[0].name\n'
'df1 = df[features_cols]\n'
'predictions = sess.run([label_name], {input_name: df1.values.astype(np.float32)})[0]\n'
'\n'
'result = df\n'
'result[pred_col] = pd.DataFrame(predictions, columns=[pred_col])'
'\n'
;
samples | evaluate python(typeof(*), code, kwargs, external_artifacts=pack('onnx2.zip', 'LINK TO SANDBOX'))
};
RAW_TSM_ReadingValues
| take 1000
| extend pred_Occupancy=bool(0)
| invoke predict_onnx_fl('ONNX', pack_array('value', 'value_tariff'), 'pred_Occupancy')
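
For context, the `smodel` string in the query above is a hex dump of the serialized ONNX model, which the sandbox code turns back into bytes with `binascii.unhexlify`. A minimal sketch of that round trip, assuming the model is already available as bytes (the helper names `model_to_hex` and `hex_to_model` are hypothetical, not part of any library):

```python
import binascii

def model_to_hex(model_bytes: bytes) -> str:
    """Encode serialized model bytes as a hex string that fits in a KQL string literal."""
    return binascii.hexlify(model_bytes).decode("ascii")

def hex_to_model(hex_string: str) -> bytes:
    """Inverse of model_to_hex; the sandbox code does the same thing with smodel."""
    return binascii.unhexlify(hex_string)

# First 12 bytes of the smodel string from the query above.
prefix = hex_to_model("08031208736b6c326f6e6e78")
# Bytes 4..11 hold the value of what appears to be the producer-name field.
producer = prefix[4:12].decode("ascii")
```

Here `producer` decodes to `skl2onnx`, which suggests the model was exported with that converter. The decoding itself succeeds before `rt.InferenceSession` is ever reached, so it is not related to the ImportError.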

And here is the error message I get:

 Query execution has resulted in error (0x80131500): Partial query failure: 0x80131500 (message:
 'Encountered an error during execution of local sandbox. Error details: Python code execution failed 
 with the following error: ImportError: cannot import name 'get_all_providers'; Traceback (most 
 recent call last):
 File "C:\Enlistments\Kusto\Azure-Kusto- 
 Service\Src\Common\Kusto.Cloud.Platform.Sandbox\Languages\Python\execute_python.py", line 155, in 
 exec_python
 File "<string>", line 8, in <module>
 File "C:\Temp\onnxruntime\__init__.py", line 13, in <module>
 from onnxruntime.capi._pybind_state import get_all_providers, get_available_providers, get_device, 
 set_seed, \
 ImportError: cannot import name 'get_all_providers'
 .  ==> ExecutePluginOperator failure: ', details: 'Source: DataNode
 [0]Kusto.Cloud.Platform.Sandbox.Exceptions.SandboxExecutionException: Encountered an error during 
 execution of local sandbox. Error details: Python code execution failed with the following error: 
 ImportError: cannot import name 'get_all_providers'; Traceback (most recent call last):
   File "C:\Enlistments\Kusto\Azure-Kusto- 
Service\Src\Common\Kusto.Cloud.Platform.Sandbox\Languages\Python\execute_python.py", line 155, in 
 exec_python
   File "<string>", line 8, in <module>
   File "C:\Temp\onnxruntime\__init__.py", line 13, in <module>
     from onnxruntime.capi._pybind_state import get_all_providers, get_available_providers, 
 get_device, set_seed, \
 ImportError: cannot import name 'get_all_providers'
 .
 Timestamp=2020-09-17T10:31:20.4439778Z
 ClientRequestId=KustoWebV2;f7d0f61f-7b8c-4855-a5fa-a5a5370725ba
 ActivityId=fdee5c65-0d8f-47b0-bda8-9a26553eaf22
 ActivityType=DN.FE.ExecuteQuery
 ServiceAlias=DACHSADXCLUSTEREUW
 MachineName=KEngine000000
 ProcessName=Kusto.WinSvc.Svc
 ProcessId=5812
 ThreadId=332
 AppDomainName=Kusto.WinSvc.Svc.exe
 ActivityStack=(Activity stack: CRID=KustoWebV2;f7d0f61f-7b8c-4855-a5fa-a5a525ba ARID=fdee5-0d8f- 
47b0-bda8-9a2655af22 > DN.FE.ExecuteQuery/ff6edaa-e90-4050-b5c9-6f7a59abd6)

 ErrorCode=
 ErrorReason=
 ErrorMessage=
 DataSource=
 DatabaseName=
 ClientRequestId=
 ActivityId=00000000-0000-0000-0000-000000000000
 Details=Python code execution failed with the following error: ImportError: cannot import name 
 'get_all_providers'; Traceback (most recent call last):
   File "C:\Enlistments\Kusto\Azure-Kusto- 
Service\Src\Common\Kusto.Cloud.Platform.Sandbox\Languages\Python\execute_python.py", line 155, in 
 exec_python
   File "<string>", line 8, in <module>
   File "C:\Temp\onnxruntime\__init__.py", line 13, in <module>
     from onnxruntime.capi._pybind_state import get_all_providers, get_available_providers, 
 get_device, set_seed, \
 ImportError: cannot import name 'get_all_providers'

    at Kusto.DataNode.QueryService.PluginsV2.SandboxedPluginBase.ExecuteInSandbox(SandboxKind 
 sandboxKind, ISandboxManager sandboxManager, ClientRequestProperties clientRequestProperties, 
 IDictionary`2 argumentsPropertyBag, IStreamSource inputStreamSource, IDictionary`2 
 externalArtifacts, OperationStatistics& operationStatistics) in 
 C:\source\Src\Engine\DataNode\QueryService\PluginsV2\DistributedPlugins\SandboxedPluginBase.cs:line 56
    at 
 Kusto.DataNode.QueryService.PluginsV2.ScriptExecutionPluginBase.Execute(PluginDistributionCapsule 
 distributionCapsule, ClientRequestProperties clientRequestProperties, IStreamSource 
 inputStreamSource, OperationStatistics& operationStatistics) in C:\source\Src\Engine\DataNode\QueryService\PluginsV2\DistributedPlugins\Languages\ScriptExecutionPluginBase.cs:line 144
    at Kusto.DataNode.DataEngineQueryPlan.DataEngineQueryProcessor.DataEngineQueryCallback.ExecutePluginOperator(String pluginName, DataSourceStreamFormat inputStreamFormat, DataSourceStreamFormat outputStreamFormat, String pluginSerializedContext, String serializedQueryContextProperties, IStreamSource inputStream, OperationStatistics& operationStatistics) in C:\source\Src\Engine\DataNode\QueryService\DataEngineQueryPlan\DataEngineQueryProcessor.cs:line 455').

  clientRequestId: KustoWebV2;f7d0f61f-7bc-4855-a5fa-a5a53ba

If anyone has an idea what the problem could be, I would be really grateful!

python
package
sandbox
azure-data-explorer
onnxruntime
asked on Stack Overflow Sep 17, 2020 by Torb

1 Answer

We decided to add the onnxruntime package to the common image, which avoids issues like the one you faced. This update has already started rolling out to production clusters and is expected to be available in 1-2 weeks (cluster dependent). Once it's deployed, you can use it as explained for the new function predict_onnx_fl(). Thanks, Adi
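
One way to see which copy of a package the sandbox would pick up, sketched here under assumptions rather than as an official procedure, is to ask Python where the package resolves from without importing it. The helper name `package_origin` is hypothetical; `importlib.util.find_spec` is the standard-library call doing the work:

```python
import importlib.util

def package_origin(name):
    """Return the file a top-level package would be loaded from, or None if it is not found.

    For a top-level name, find_spec locates the package without executing its
    __init__.py, so it works even when the import itself would fail.
    """
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin
```

Calling `package_origin("onnxruntime")` inside the plugin code would show whether the package resolves from the image or from an uploaded zip. In the traceback from the question it was loaded from `C:\Temp\onnxruntime\__init__.py`.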

answered on Stack Overflow Sep 17, 2020 by Adi E • edited Sep 18, 2020 by Yoni

User contributions licensed under CC BY-SA 3.0