Andrew's Blog

Random Thoughts of an ASP.Net Code Monkey

Telemetry on Cloud Platforms–Resources

May 28, 2014 03:26 by Andrew Westgarth

Last night I had the pleasure of presenting at Dot Net Notts (@dotnetnotts), a relatively new User Group in Nottingham.  Having been born in Nottingham, it was a really nice experience to go back, visit some family and then present at this vibrant and welcoming new user group.

I was talking about Telemetry and Logging on a Cloud Platform (namely Microsoft Azure), and trying to pass on some of the lessons learned from running a large multi-product platform.  Hopefully the talk was well received and everyone enjoyed it.  It was a different type of talk to the majority I’ve given before, and one I’ve been wanting to deliver for a while.  It covered mistakes that are easy to make, processes that can help, and some tips along the way for mitigating those mistakes and providing a supportable approach to telemetry.  Here are some of the tips, conclusions and resources from the talk:

Logging Tips:

  • Instrument for insight into the application
  • Capture inter-service application activity and latency
  • Ensure the logging level can be altered at run-time
  • Abstract logging – gives agility to change framework
  • ALWAYS ENABLE LOGGING
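To make a few of these tips concrete, here is a minimal sketch of a logging abstraction that can change level at run-time and record the latency of calls to other services. It is an illustration only: Python's standard `logging` module stands in for whichever framework you actually use, and `AppLogger`, `timed_call` and the service names are hypothetical names invented for this example rather than anything shown in the talk.

```python
import logging
import time
from contextlib import contextmanager

# Hypothetical mapping from configuration strings to framework levels.
_LEVELS = {
    "ERROR": logging.ERROR,
    "WARNING": logging.WARNING,
    "INFO": logging.INFO,
    "DEBUG": logging.DEBUG,
}


class AppLogger:
    """Thin facade: application code depends on this class, not on the framework."""

    def __init__(self, name, level="INFO"):
        self._logger = logging.getLogger(name)
        self._logger.setLevel(_LEVELS[level])

    def set_level(self, level):
        # Intended to be called at run-time, e.g. when a config value changes.
        self._logger.setLevel(_LEVELS[level])

    def is_enabled(self, level):
        return self._logger.isEnabledFor(_LEVELS[level])

    def info(self, message, **fields):
        self._logger.info("%s %s", message, fields)

    def debug(self, message, **fields):
        self._logger.debug("%s %s", message, fields)

    @contextmanager
    def timed_call(self, service, operation):
        """Log the latency of a call to another service, even if it raises."""
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed_ms = round((time.perf_counter() - start) * 1000, 2)
            self.info("dependency call", service=service,
                      operation=operation, elapsed_ms=elapsed_ms)


if __name__ == "__main__":
    logging.basicConfig(format="%(levelname)s %(name)s %(message)s")
    log = AppLogger("shop.orders")
    with log.timed_call("inventory", "reserve_stock"):
        time.sleep(0.01)  # stand-in for a real remote call
    log.debug("not emitted at INFO")
    log.set_level("DEBUG")  # raise verbosity without a restart
    log.debug("emitted after the run-time change")
```

Because callers only ever see `AppLogger`, swapping the underlying framework should mean changing this one class rather than every call site.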

Logging Level guides from a talk by Scott Guthrie at NDC 2013

Level: Context

ERROR: Always on in Production. Any errors will trigger ACTION to resolve (automated or human).

  • Configuration issues
  • Application failures

WARNING: Always on in Production. Warnings will INFORM, and may signal potential ACTION.

  • Timeouts or throttling in external service

INFO: Always on in Production. Info messages INFORM during diagnostics and troubleshooting.

DEBUG (VERBOSE): On during active debugging and troubleshooting on a case-by-case basis.

NDC 2013 – Scott Guthrie: Building Real World Cloud Apps With Windows Azure PT2
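One way to apply the guide above is to keep ERROR, WARNING and INFO on everywhere and switch DEBUG on for a single component only while it is being investigated. The sketch below assumes Python's standard `logging` module and its dotted logger hierarchy; the function names and the `app.billing` / `app.orders` component names are hypothetical.

```python
import logging

# ERROR, WARNING and INFO are always on in production under this policy.
PRODUCTION_BASELINE = logging.INFO


def configure_production(root_name="app"):
    """Set the baseline level that every component inherits."""
    logger = logging.getLogger(root_name)
    logger.setLevel(PRODUCTION_BASELINE)
    return logger


def enable_debug_for(component, root_name="app"):
    """Turn DEBUG on for one component while it is being investigated."""
    child = logging.getLogger(f"{root_name}.{component}")
    child.setLevel(logging.DEBUG)
    return child


def disable_debug_for(component, root_name="app"):
    """Return the component to the inherited production baseline."""
    logging.getLogger(f"{root_name}.{component}").setLevel(logging.NOTSET)


if __name__ == "__main__":
    logging.basicConfig(format="%(levelname)s %(name)s %(message)s")
    configure_production()
    billing = enable_debug_for("billing")
    billing.debug("verbose detail for the component under investigation")
    logging.getLogger("app.orders").debug("still suppressed elsewhere")
    disable_debug_for("billing")
```

Setting the child back to `NOTSET` makes it inherit the baseline again, so verbose logging does not stay on by accident.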

Conclusions

  • Involve key stakeholders early in the design phase, from both a product and a platform perspective
  • You may have to tell them what they need
  • Consider telemetry needs at an early stage of development
  • Consider the SLA you want to provide and how to prove it
  • Choose frameworks and services carefully
  • Iterate repeatedly – requirements evolve during the lifetime of a product/platform
  • Telemetry is important – just as important as new product features
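On the point about proving an SLA: if request outcomes are captured as telemetry, a first approximation of availability can be computed directly from them. This sketch assumes availability is defined as the share of successful requests in a period, which is only one possible definition and not necessarily the one your SLA uses; `RequestRecord`, `availability_percent` and `meets_sla` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class RequestRecord:
    """One request outcome as captured by telemetry (hypothetical shape)."""
    succeeded: bool
    latency_ms: float


def availability_percent(records):
    """Percentage of requests that succeeded in the measured period."""
    if not records:
        raise ValueError("no telemetry captured, availability cannot be proven")
    successes = sum(1 for r in records if r.succeeded)
    return 100.0 * successes / len(records)


def meets_sla(records, target_percent):
    """True when measured availability reaches the promised target."""
    return availability_percent(records) >= target_percent


if __name__ == "__main__":
    # 999 successful requests and 1 failure in the period.
    period = [RequestRecord(True, 120.0)] * 999 + [RequestRecord(False, 30000.0)]
    print(availability_percent(period))
    print(meets_sla(period, 99.95))
```

The empty-telemetry case raises rather than returning 100%, since an SLA you cannot measure is one you cannot prove.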

Resources



