Error message

  • Notice: Trying to access array offset on value of type int in element_children() (line 6543 of /home1/dewey/drupal/includes/common.inc).
  • Notice: Trying to access array offset on value of type int in element_children() (line 6543 of /home1/dewey/drupal/includes/common.inc).
  • Notice: Trying to access array offset on value of type int in element_children() (line 6543 of /home1/dewey/drupal/includes/common.inc).
  • Deprecated function: implode(): Passing glue string after array is deprecated. Swap the parameters in drupal_get_feeds() (line 394 of /home1/dewey/drupal/includes/common.inc).
  • Deprecated function: The each() function is deprecated. This message will be suppressed on further calls in menu_set_active_trail() (line 2404 of /home1/dewey/drupal/includes/menu.inc).

Thoughts on running software

I'm partially on operations guy, so, net result, I'm responsible making software run. What exactly does this mean?

Here's the minimum set:

  • RUN: Arrange for it to run
  • MONITOR: How do I find out if it stops running?
  • PUBLISH: Make it available to the people who need it
  • ADVERTISE: And make sure those people know about it.

Other things I might want to do depend on the quality of job I want to do (a.k.a. "Service Level Agreement" or SLA)

  • Durable: ensure it keeps running after maintenance and transient problems (i.e. survive a reboot, recovers from a network outage, etc.)
  • Repeatable: Make it something I can easily set up again.
  • Recoverable: If something goes wrong, make sure I can get it back to where it was (this implies backup strategy)
  • Resilient: "DR" -- be able to keep providing services in the event of critical failures.

These lists are in rough order of importance (but sometimes things vary).

In the desktop world, the first part was mostly "slap it on a CD and they'll tell you if it stops working" coupled with "how do we sell this thing?". In a SaaS environment this is somewhat different: monitoring becomes more important (for large scale services, no one is going to call support but they'll get on social media and tell people it's not working). Also, the "it depends" aspects start becoming interesting. For any significant service, durability and resiliency become important. Repeatability becomes important for upgrades and development.

A lot of people believe that if it's resilient it doesn't need to be recoverable. Perhaps true, perhaps not -- I've easily lost an order of magnitude more data to user mistakes than I have to systems failures, so even resilient systems might need to be recoverable.

Ultimately, what you do with each pieces depends on your requirements, but it's worth considering all of them.

Ready for Validation

Add new comment

Filtered HTML

  • Web page addresses and e-mail addresses turn into links automatically.
  • Allowed HTML tags: <a> <em> <strong> <cite> <blockquote> <code> <ul> <ol> <li> <dl> <dt> <dd>
  • Lines and paragraphs break automatically.

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.