Unhealthy event: SourceId='FabricDCA', Property='DataCollectionAgent.DiskSpaceAvailable', HealthState='Warning', ConsiderWarningAsError=false. The Data Collection Agent (DCA) does not have enough disk space to operate. Diagnostics information will be left uncollected if this continues to happen.

You will often see this warning almost as soon as your Service Fabric cluster comes up if you are using a VMSS whose VMs have a small temporary disk (D:\).

So what’s going on here?

By default, Service Fabric stores the reliable collections logs for reliable system services in D:\SvcFab. To verify this, you can remote desktop into one of the VMs in the VMSS where this warning is coming from. Most people will only see this warning on the primary node type, because the services you create as a customer are generally stateless, so no stateful data logs are present on the non-primary node types.

The default size of the replicator log (reliable collections) is 8192 MB, so if your temporary disk is, for example, 7 GB (Standard_D2_v2), you will see the following warning message in the cluster explorer:

Unhealthy event: SourceId='FabricDCA', Property='DataCollectionAgent.DiskSpaceAvailable', HealthState='Warning', ConsiderWarningAsError=false. The Data Collection Agent (DCA) does not have enough disk space to operate. Diagnostics information will be left uncollected if this continues to happen.
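To make the arithmetic concrete, here is a minimal Python sketch that compares the replicator log size against the temporary disk size. The function name `shared_log_fits` and the GB-to-MB conversion factor of 1024 are my own assumptions for illustration; only the 8192 MB default and the 7 GB example come from this post. It ignores everything else stored on the disk, so treat a "fits" result as a lower bound rather than a guarantee.

```python
# Default replicator (shared) log size quoted in this post, in MB.
DEFAULT_SHARED_LOG_MB = 8192


def shared_log_fits(temp_disk_gb: float, shared_log_mb: int = DEFAULT_SHARED_LOG_MB) -> bool:
    """Return True if the shared log alone is smaller than the temporary disk.

    Assumes 1 GB = 1024 MB and ignores any other files on the disk.
    """
    temp_disk_mb = temp_disk_gb * 1024
    return shared_log_mb < temp_disk_mb


if __name__ == "__main__":
    # 7 GB temporary disk with the default 8192 MB log: does not fit.
    print(shared_log_fits(7))
    # Same disk with the log reduced to 4096 MB: fits.
    print(shared_log_fits(7, 4096))
```

With a 7 GB disk (7168 MB) the default 8192 MB log cannot fit, while a 4096 MB log can.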

How to fix this?

You can change the default replicator log size by adding a fabric setting named "KtlLogger" to the ARM template, as shown below. Once configured, this file size stays fixed (it neither grows nor shrinks):

"fabricSettings": [
  {
    "name": "Security",
    "parameters": [
      {
        "name": "ClusterProtectionLevel",
        "value": "EncryptAndSign"
      }
    ]
  },
  {
    "name": "KtlLogger",
    "parameters": [
      {
        "name": "SharedLogSizeInMB",
        "value": "4096"
      }
    ]
  }
]
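If you generate or patch your ARM templates with a script, the same change can be sketched in Python. This is only an illustration: the helper name `with_shared_log_size` is hypothetical, and it assumes you have already loaded the "fabricSettings" array from your template as a list of dictionaries. The section and parameter names ("KtlLogger", "SharedLogSizeInMB") are the ones shown above.

```python
import json


def with_shared_log_size(fabric_settings, size_mb):
    """Return a copy of fabric_settings with KtlLogger/SharedLogSizeInMB set.

    Hypothetical helper: adds the KtlLogger section if it is missing,
    otherwise updates the existing SharedLogSizeInMB parameter.
    """
    # Copy each section and its parameters so the input list is not modified.
    updated = [
        dict(section, parameters=[dict(p) for p in section.get("parameters", [])])
        for section in fabric_settings
    ]

    section = next((s for s in updated if s["name"] == "KtlLogger"), None)
    if section is None:
        section = {"name": "KtlLogger", "parameters": []}
        updated.append(section)

    param = next(
        (p for p in section["parameters"] if p["name"] == "SharedLogSizeInMB"), None
    )
    if param is None:
        # ARM template parameter values are written as strings in this post.
        section["parameters"].append(
            {"name": "SharedLogSizeInMB", "value": str(size_mb)}
        )
    else:
        param["value"] = str(size_mb)
    return updated


if __name__ == "__main__":
    existing = [
        {
            "name": "Security",
            "parameters": [
                {"name": "ClusterProtectionLevel", "value": "EncryptAndSign"}
            ],
        }
    ]
    print(json.dumps(with_shared_log_size(existing, 4096), indent=2))
```

Returning a copy rather than editing in place keeps the original settings available for comparison before you redeploy.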

 

For VM temporary disk sizes and specs, see https://docs.microsoft.com/en-us/azure/virtual-machines/windows/sizes-general

More info around configuring reliable services\manifest is here-

 

Service Fabric to Cloud Services, Why?

These are some interesting benefits of using Service Fabric (SF), both over Cloud Services and in general:

  1. High Density- Unlike cloud services, you can run multiple services on a single VM, saving you both cost and management overhead. SF will re-balance or scale out the cluster if resource contention is predicted or occurs.
  2. Any Cloud or Data Center- A Service Fabric cluster can be deployed in Azure, on-premises or even in a 3rd party cloud if you need to, due to an unforeseen change in the company’s direction or regulatory requirements. It just runs better in Azure. Why? Because certain services are provided in Azure as a value addition, e.g. the upgrade service.
  3. Any OS- A Service Fabric cluster can run on both Windows and Linux. In the near future, you will be able to have a single cluster running both Windows and Linux workloads in parallel.
  4. Faster Deployments- As you do not create a new VM per service like in cloud services, new services can be deployed to the existing VMs as per the configured placement constraints, making deployments much faster.
  5. Application Concept- In the microservices world, multiple services working together form a business function or an application. SF understands the concept of an application rather than just the individual services which constitute a business function. SF treats and manages an application and its services as one entity to maintain the health and consistency of the platform, unlike cloud services.
  6. Generic Orchestration Engine- Service Fabric can orchestrate both at process and container level should you need to. One technology to learn to rule them all.
  7. Stateful Services- A managed programming model to develop stateful services following the OOP principle of encapsulation, i.e. keeping state and operations as a unit. Other container orchestration engines cannot do this. And of course you can develop reliable stateless services as well.
  8. Multi-tenancy- Deploy multiple versions of the same application for multiple clients side by side, or do canary testing.
  9. Rolling Upgrades- Upgrade both applications and the platform without any downtime with a sophisticated rolling upgrade feature set.
  10. Scalable- Scale to hundreds or thousands of VMs if you need to, with automatic or manual scaling.
  11. Secure- Inter-VM encryption, cluster management authentication/authorization (RBAC) and network-level isolation are just a few ways to secure your cluster in an enterprise-grade manner.
  12. Monitoring- Unlike cloud services, SF comes with a built-in OMS solution which understands the events raised by the SF cluster and can take appropriate action. Both in-process and out-of-process logging are supported.
  13. Resource Manager Deployments- Unlike cloud services, which still run in the classic deployment model, SF clusters and applications use Resource Manager deployments, which are much more flexible and deploy only the artefacts you need.
  14. Pace of Innovation- Cloud services is an old platform, still used by many large organisations for critical workloads but it is not the platform which will get new innovative features in future.

More technical differences are here.