Service Fabric Cluster Deploy Fails


I'm having a problem deploying a cluster to Azure. I'm using the template provided through Visual Studio (2017) as described here, securing it with a server/cluster certificate as described here.

I'm deploying via Visual Studio, and the template seems to deploy successfully without any errors. However, when I look at the cluster in the portal, it is stuck in the "Deploying" state with no nodes appearing. RDPing into individual nodes and looking in the Event Viewer (Windows Logs/System) reveals that the Azure Service Fabric Node Bootstrap Agent service is stuck in a loop, starting and stopping seemingly indefinitely.

Under Windows Logs/Application I can see the following four errors/warnings, repeated for each restart attempt:

Failed starting service, Error: System.ArgumentNullException: Value cannot be null. Parameter name: path
   at System.IO.Path.GetFullPathInternal(String path)
   at Microsoft.Azure.ServiceFabric.Extension.Core.SetupHelper.ConfigNode(Byte[] clusterManifest, String nodeTypeRef, String machineName, String ipAddress, String faultDomain, String upgradeDomain, String dataRoot)
   at Microsoft.Azure.ServiceFabric.Extension.Core.NodeBootstrapAgent.TryConfigNode(RuntimeCluster clusterConfig, NodeDescription nodeDescription)
   at Microsoft.Azure.ServiceFabric.Extension.Core.NodeBootstrapAgent.StartFabricHostService(Boolean isBootstrapping)

ERROR: System.ArgumentNullException: Value cannot be null. Parameter name: path
   at System.IO.Path.GetFullPathInternal(String path)
   at Microsoft.Azure.ServiceFabric.Extension.Core.SetupHelper.ConfigNode(Byte[] clusterManifest, String nodeTypeRef, String machineName, String ipAddress, String faultDomain, String upgradeDomain, String dataRoot)
   at Microsoft.Azure.ServiceFabric.Extension.Core.NodeBootstrapAgent.TryConfigNode(RuntimeCluster clusterConfig, NodeDescription nodeDescription)
   at Microsoft.Azure.ServiceFabric.Extension.Core.NodeBootstrapAgent.StartFabricHostService(Boolean isBootstrapping)
   at Microsoft.Azure.ServiceFabric.Extension.Core.NodeBootstrapAgent.d__d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.Azure.ServiceFabric.Extension.Core.NodeBootstrapAgent.d__0.MoveNext()

Application: ServiceFabricNodeBootstrapAgent.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.ArgumentNullException
   at System.IO.Path.GetFullPathInternal(System.String)
   at Microsoft.Azure.ServiceFabric.Extension.Core.SetupHelper.ConfigNode(Byte[], System.String, System.String, System.String, System.String, System.String, System.String)
   at Microsoft.Azure.ServiceFabric.Extension.Core.NodeBootstrapAgent.TryConfigNode(Microsoft.Azure.ServiceFabric.Extension.Core.RuntimeCluster, Microsoft.Azure.ServiceFabric.Extension.Core.NodeDescription)
   at Microsoft.Azure.ServiceFabric.Extension.Core.NodeBootstrapAgent.StartFabricHostService(Boolean)
   at Microsoft.Azure.ServiceFabric.Extension.Core.NodeBootstrapAgent+d__d.MoveNext()
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(System.Threading.Tasks.Task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)
   at Microsoft.Azure.ServiceFabric.Extension.Core.NodeBootstrapAgent+d__0.MoveNext()
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(System.Threading.Tasks.Task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)
   at Microsoft.Azure.ServiceFabric.Extension.Service.Service+d__0.MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore+<>c.b__6_1(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()

Faulting application name: ServiceFabricNodeBootstrapAgent.exe, version: 1.0.0.143, time stamp: 0x58c87254
Faulting module name: KERNELBASE.dll, version: 6.3.9600.18340, time stamp: 0x57366075
Exception code: 0xe0434352
Fault offset: 0x0000000000008a5c
Faulting process id: 0x9b0
Faulting application start time: 0x01d29d73912bda98
Faulting application path: C:\Packages\Plugins\Microsoft.Azure.ServiceFabric.ServiceFabricNode\1.0.0.34\Service\ServiceFabricNodeBootstrapAgent.exe
Faulting module path: C:\Windows\system32\KERNELBASE.dll
Report Id: cf297669-0966-11e7-80c5-000d3a27d68c
Faulting package full name:
Faulting package-relative application ID:

Restarting the nodes does not help, and I have verified that the certificate gets installed on the VMs. No errors are logged in the portal, just the "Deploying" message on the cluster. Nodes are Windows Server R2. Any ideas? Obviously the path given to System.IO.Path.GetFullPathInternal is null, but what could cause that?
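To make the reasoning about the null path easier to follow, here is a minimal Python sketch that splits a flattened .NET stack trace like the ones above into frames and reports the frame that threw and its direct caller. It assumes frames are separated by " at ", as in the log excerpts; `split_frames` and `method_name` are hypothetical helper names and are not part of any Service Fabric tooling.

```python
import re

def split_frames(flat_trace):
    """Split a single-line .NET stack trace into its individual frames."""
    # Assumption: each frame is introduced by " at " followed by a type name;
    # the text before the first frame is the exception message.
    parts = re.split(r"\s+at\s+(?=[A-Za-z_])", flat_trace)
    return [p.strip() for p in parts[1:]]

def method_name(frame):
    """Return the fully qualified method name, dropping the parameter list."""
    return frame.split("(", 1)[0]

# Excerpt of the first logged error, copied from the question.
trace = (
    "System.ArgumentNullException: Value cannot be null. Parameter name: path "
    "at System.IO.Path.GetFullPathInternal(String path) "
    "at Microsoft.Azure.ServiceFabric.Extension.Core.SetupHelper.ConfigNode("
    "Byte[] clusterManifest, String nodeTypeRef, String machineName, "
    "String ipAddress, String faultDomain, String upgradeDomain, String dataRoot) "
    "at Microsoft.Azure.ServiceFabric.Extension.Core.NodeBootstrapAgent.TryConfigNode("
    "RuntimeCluster clusterConfig, NodeDescription nodeDescription)"
)

frames = split_frames(trace)
thrower, caller = method_name(frames[0]), method_name(frames[1])
print("thrown in:", thrower)
print("called by:", caller)
```

Read this way, the null value is handed to GetFullPathInternal from inside SetupHelper.ConfigNode. That method takes a dataRoot string, which suggests (but does not prove) that the data root setting is the value arriving as null.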

azure-service-fabric
asked on Stack Overflow Mar 15, 2017 by Bobamackara

1 Answer


User contributions licensed under CC BY-SA 3.0