purebill.com: Stephen Jones writing on billing and application migration
Column - 29 August 2008

Applications fail - design for ease of recovery

Summary

Whilst infrastructure designs allow for hardware failure (employing redundancy, failover and a range of other techniques), the software equivalent is to bake 'ease of recovery' into the application's initial design. When a software problem affects the application's core data and related tasks, a design that helps the support staff limit the problem's scope, identify its impact and resume normal processing will pay off.

Applications range from the transactional (websites) through to more batch-driven processing (billing), so a software failure will look different in different contexts and the scope of its impact will also vary. The impact can be as small as an account in error being placed in suspense, or as large as a core business application halted until a resolution is made.

Failure at each step in the processing chain needs to be considered for its impact on other work being performed, and for how its particular recovery would be carried out. Any recovery path identified as requiring the entire application to be halted needs special design attention: it suggests a core application process is involved, and once the fix has commenced, no business activity can be performed until the fix has completed.

Questions that can guide a recovery design review include:
Modest recovery times, achieved by the application's support staff alone using tools they employ on a daily basis, suggest the application's design supports the recovery goal well.

It is better to consider the recovery approach in a calm, measured way up front, while the application's design can still be changed and tested if required, than to do the thinking with the application broken, crucial data lost and possibly unrecoverable, and the business owner asking questions...

Note: This column was first posted by me on the 97 Things Every Software Architect Should Know website under a Creative Commons Attribution 3 license.

Tags: Production Support, Outage, Impact
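As an illustration of the smaller end of the impact range discussed in this column (an account in error being placed in suspense rather than the whole run halting), here is a minimal Python sketch. Every name in it (`BatchResult`, `run_batch`, `reprocess_suspense`, the account ids) is hypothetical and chosen only for this example; it is one possible shape under those assumptions, not a description of any particular billing system.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class BatchResult:
    """Outcome of one batch run (hypothetical structure for this sketch)."""
    processed: List[str] = field(default_factory=list)
    # account id -> short description of the error that caused the suspense
    suspense: Dict[str, str] = field(default_factory=dict)


def run_batch(account_ids: Iterable[str],
              process: Callable[[str], None]) -> BatchResult:
    """Process each account, isolating failures instead of halting the run."""
    result = BatchResult()
    for account_id in account_ids:
        try:
            process(account_id)
            result.processed.append(account_id)
        except Exception as exc:
            # Deliberately broad: any failure limits its scope to this one
            # account and records enough detail for support staff to assess it.
            result.suspense[account_id] = f"{type(exc).__name__}: {exc}"
    return result


def reprocess_suspense(previous: BatchResult,
                       process: Callable[[str], None]) -> BatchResult:
    """After a fix has been applied, rerun only the suspended accounts."""
    return run_batch(list(previous.suspense), process)
```

In this sketch a run over accounts `A1`, `A2` and `A3`, where only `A2` fails, leaves `A1` and `A3` processed and `A2` in suspense with its error text. Once the cause is fixed, `reprocess_suspense` reruns `A2` alone, so the recovery scope stays as small as the failure scope.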
Comments welcome: stephenjones(at)purebill.com | Stephen Jones © 2004-2010