Last night I had the pleasure of presenting at Dot Net Notts (@dotnetnotts), a relatively new User Group in Nottingham. As I was born in Nottingham, it was a really nice experience to go back, visit some family and then present at this vibrant and welcoming new user group.
I talked about telemetry and logging on a cloud platform (namely Microsoft Azure), drawing on lessons learned from running a large multi-product platform. I hope the talk was well received and that everyone enjoyed it. It was a different type of talk from most I’ve given before, and one I’ve been wanting to deliver for a while. It covered mistakes that are easy to make, processes that can help, and tips for mitigating those mistakes and building a supportable approach to telemetry. Here are some of the tips, conclusions and resources from the talk:
Logging Tips:
- Instrument for insight into the application
- Capture inter-service application activity and latency
- Ensure the logging level can be altered at run-time
- Abstract logging – gives agility to change framework
- ALWAYS ENABLE LOGGING
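As a rough illustration of three of the tips above (abstracting the logging framework, changing the level at run-time, and capturing inter-service latency), here is a minimal Python sketch built on the standard library `logging` module. The names `AppLogger`, `StdlibLogger`, `set_level` and `timed_call` are hypothetical and were not part of the talk; a real platform would likely wire `set_level` to whatever configuration-change mechanism it already has.

```python
import logging
import time
from contextlib import contextmanager
from typing import Iterator, Protocol

# Level names accepted at run-time (an assumption made for this sketch).
_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


class AppLogger(Protocol):
    """Hypothetical interface the application codes against."""

    def debug(self, message: str) -> None: ...
    def info(self, message: str) -> None: ...
    def warning(self, message: str) -> None: ...
    def error(self, message: str) -> None: ...


class StdlibLogger:
    """Adapter over the standard library; replace it to change framework."""

    def __init__(self, name: str, level: int = logging.INFO) -> None:
        self._logger = logging.getLogger(name)
        self._logger.setLevel(level)

    def set_level(self, level_name: str) -> None:
        # Intended to be called while the service is running, with no redeploy.
        self._logger.setLevel(_LEVELS[level_name.upper()])

    def debug(self, message: str) -> None:
        self._logger.debug(message)

    def info(self, message: str) -> None:
        self._logger.info(message)

    def warning(self, message: str) -> None:
        self._logger.warning(message)

    def error(self, message: str) -> None:
        self._logger.error(message)


@contextmanager
def timed_call(logger: AppLogger, service: str, operation: str) -> Iterator[None]:
    """Log how long a call to another service took, even if it raised."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info(
            f"call service={service} operation={operation} latency_ms={elapsed_ms:.1f}"
        )
```

Application code would depend only on `AppLogger`, wrap outbound calls in `with timed_call(log, "billing", "charge"):`, and call `log.set_level("debug")` when troubleshooting starts.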
Logging level guidance from a talk by Scott Guthrie at NDC 2013:
Level | Context |
ERROR | Always on in Production. Any errors will trigger ACTION to resolve (automated or human). Examples: configuration issues; application failures |
WARNING | Always on in Production. Warnings will INFORM, and may signal potential ACTION. Examples: timeouts or throttling in an external service |
INFO | Always on in Production. Info messages INFORM during diagnostics and troubleshooting |
DEBUG (VERBOSE) | On during active debugging and troubleshooting on a case by case basis |
NDC 2013 – Scott Guthrie: Building Real World Cloud Apps With Windows Azure PT2
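The level guidance above can also be written down as data, so that a service picks its logging threshold from one place. This is only a sketch: `LEVEL_POLICY`, `production_threshold` and `needs_action` are illustrative names, and the mapping onto Python's `logging` constants is an assumption rather than anything shown in the talk.

```python
import logging

# The table above expressed as data (illustrative structure, not from the talk).
LEVEL_POLICY = {
    logging.ERROR: {"always_on": True, "response": "ACTION"},
    logging.WARNING: {"always_on": True, "response": "INFORM, possible ACTION"},
    logging.INFO: {"always_on": True, "response": "INFORM"},
    logging.DEBUG: {"always_on": False, "response": "diagnostics only"},
}


def production_threshold(troubleshooting: bool = False) -> int:
    """Return the lowest level that should be emitted in production."""
    if troubleshooting:
        # DEBUG (VERBOSE) is switched on case by case.
        return logging.DEBUG
    return min(level for level, rule in LEVEL_POLICY.items() if rule["always_on"])


def needs_action(level: int) -> bool:
    """Only ERROR and above must trigger an automated or human response."""
    return level >= logging.ERROR
```

With this policy, `production_threshold()` gives `logging.INFO` for normal running and `logging.DEBUG` while actively troubleshooting.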
Conclusions
- Involve key stakeholders early in the design phase from a product and platform perspective
  - You may have to tell them what they need
- Consider telemetry needs at early stage of development
- Consider the SLA you want to provide and how you will prove it
- Choose frameworks and services carefully
- Iterate repeatedly – requirements evolve during lifetime of a product/platform
- Telemetry is important – just as important as new product features
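On the point about proving an SLA, one simple approach is to derive availability from request counts that the telemetry already records. The sketch below assumes availability is defined as successful requests divided by total requests, and uses a 99.9% target purely as an example; your SLA may define both differently, and the function names are hypothetical.

```python
def availability(successful: int, total: int) -> float:
    """Fraction of requests that succeeded in the measurement window."""
    if total == 0:
        # Assumption: a window with no traffic counts as fully available.
        return 1.0
    return successful / total


def meets_sla(successful: int, total: int, target: float = 0.999) -> bool:
    """Compare measured availability with the SLA target (example value)."""
    return availability(successful, total) >= target
```

For example, 99,950 successes out of 100,000 requests is 99.95% and meets a 99.9% target, while 99,800 out of 100,000 (99.8%) does not.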
Resources