|
Unbounded Environments
Today, bounded environments ensconced within
clearly demarcated perimeters are giving way to
a milieu where gateways have been rendered obsolete.
In this environment, the distinction between insiders
and outsiders has blurred, and organisations neither
possess central administrative control over their
information systems nor have access to
a global view of events occurring therein. In
such an environment, it is virtually unfeasible
to thwart cyber attacks. Traditional information
security models are ineffective when confronted
with the security problems associated with open-ended
environments.
Since no system is entirely impervious to attacks
in an unbounded environment, there is now an intense
focus on ensuring the survivability of mission-critical
systems and essential services despite the presence
of cyber attacks. Emerging technologies such as
grid computing and web services make unbounded
environments even more vulnerable, so systems must
be built with the resilience to survive an attack
and continue to fulfil their business mission
in a timely manner. The 'survive' philosophy of
modern information security is a paradigm shift
from the 'prevent' viewpoint of traditional security
models.
Survivability and availability focus on preserving
essential services in unbounded environments,
even when systems in such environments are penetrated
and compromised.
Survivability
Definition
Survivability is defined
as the capability of a system to fulfil its mission,
in a timely manner, in the presence of attacks,
failures, or accidents.
| In this definition: |
| :: |
System includes
networks and large-scale systems |
| :: |
Mission refers to a set of very
high-level or abstract goals of an organisation |
| :: |
Timeliness
is of such criticality that it is included
explicitly in the definition |
| :: |
Attacks are potentially damaging
events orchestrated by an intelligent adversary |
| :: |
Failures are potentially
damaging events caused by deficiencies in
the system or in an external element on
which the system depends
|
| :: |
Accidents describe a broad
range of randomly occurring and potentially
damaging events such as natural disasters
|
Characteristics of Survivable Systems
Identification and protection
of essential services is a vital ingredient of
a practical approach to building and analysing
survivable systems. Maintenance of essential properties
is central to the delivery of essential services.
| :: |
Essential services are
defined as those functions of the system
that must be maintained when the environment
is hostile, or when failures or accidents
occur that threaten the system
|
| :: |
Essential properties include
specified levels of integrity, confidentiality,
performance, and other quality attributes
|
Key to the concept of survivability
is the identification of essential services, and
the essential properties that support them, within
an operational system. The overall function of
a system should adapt to preserve essential services.
Thus, the capability of a survivable system to
fulfil its mission in a timely manner is linked
to its ability to deliver essential services in
the presence of an attack, accident, or failure.
To deliver essential services, survivable systems
should have the following vital characteristics:
| :: |
Resistance
to attacks |
| :: |
Recognition of attacks and the
extent of damage |
| :: |
Recovery of
full and essential services after attack |
| :: |
Adaptation and evolution to
reduce effectiveness of future attacks |
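The four characteristics above can be illustrated with a small sketch. This is only a toy model under stated assumptions, not a method taken from the text: the class names, the failed-login detector, its threshold of 5, and the example address are all hypothetical. It shows a system that sheds non-essential services while under attack, restores full service afterwards, and keeps what it learned about the attacker.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    name: str
    essential: bool          # must keep running even under attack
    running: bool = True


@dataclass
class SurvivableSystem:
    services: list
    blocked_sources: set = field(default_factory=set)

    def accepts(self, source: str) -> bool:
        """Resistance: refuse traffic from sources already known to be hostile."""
        return source not in self.blocked_sources

    def recognise(self, failed_logins: int, threshold: int = 5) -> bool:
        """Recognition: a deliberately naive detector (hypothetical threshold)."""
        return failed_logins >= threshold

    def degrade(self, source: str) -> None:
        """Adaptation: remember the attacker and shed non-essential services."""
        self.blocked_sources.add(source)
        for service in self.services:
            if not service.essential:
                service.running = False

    def recover(self) -> None:
        """Recovery: restore full service; the block list is kept."""
        for service in self.services:
            service.running = True

    def running_services(self) -> list:
        return [s.name for s in self.services if s.running]


system = SurvivableSystem([Service("payments", essential=True),
                           Service("reporting", essential=False)])
if system.recognise(failed_logins=7):
    system.degrade("203.0.113.9")      # address from a documentation-only range
print(system.running_services())       # ['payments']
system.recover()
print(system.running_services())       # ['payments', 'reporting']
print(system.accepts("203.0.113.9"))   # False
```

The design choice worth noting is that the essential service never stops: degradation removes only what the mission can do without, which is the sense in which the overall function of the system adapts to preserve essential services.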
Developing Survivability Solutions
Survivability solutions are
risk management strategies that primarily depend
on an intimate knowledge of the mission being
protected. The focus on the mission results in
the extension of survivability solutions beyond
purely independent technical solutions.
| :: |
Creating strategies
Firstly, risk mitigation strategies must
be created in the context of a mission's
requirements, which are prioritised sets
of normal and stress requirements. They
must be based on "what-if" analyses
of survival scenarios and contingency planning.
|
| :: |
Forecasting scenarios
Survival scenarios positing a wide range
of cyber attacks, accidents, and failures
assist in the analyses and contingency planning.
These scenarios focus on adverse effects
rather than causes. Effects are also of
more immediate situational importance than
causes, because an organization will likely
have to deal with and survive an adverse
effect long before a determination is made
as to whether the cause was an attack, an
accident, or a failure.
|
| :: |
Planning
Contingency and disaster planning requires
that risk management decisions and economic
tradeoffs be made by executive management,
with guidance from technical experts in
the application domain, computer security,
and other software engineering and related
disciplines.
|
Survivability depends equally
upon the risk management skills of an organization
and upon the technical expertise of information
security experts. This is certainly appropriate
from an organizational perspective, because business
risk management is a primary responsibility of
executive management, and not the role of information
security experts. The role of the experts in security
is to provide executive management with the information
necessary to make informed risk-management decisions.
Thus, the preparatory steps necessary for survivability
must be taken by an organization as a whole, rather
than by security experts alone.
Trends in Survivability Solutions
New research methods and tools under development
to support survivability solutions encompass the
following approaches:
| :: |
Designating a portion of
the infrastructure as the essential minimum
and hardening that portion against attacks
|
| :: |
Making the requirements
for survivability explicit, identifying
functionality whose absence currently prevents
adequate satisfaction of those requirements
|
| :: |
Exploring techniques for
designing and developing highly survivable
systems, despite the presence of untrustworthy
subsystems and untrustworthy participants
|
| :: |
Recommending specific architectural
structures that can lead to survivable systems
and networks capable of either preventing
or tolerating a wide range of threats
|
| :: |
Taking an adaptive control
systems perspective on survivability, which
can continue to provide control of a system
in the face of disruption to elements of
the system and control system
|
| :: |
Examining survivability requirements
for real-time command and control systems |
Availability
Definition
Availability is defined
as a disciplined methodology encompassing the
entire IT infrastructure to ensure guaranteed,
consistent and predictable access to any component
of the infrastructure.
Note that, under this definition, availability means
that business systems continue to provide
acceptable levels of performance under both normal
and unexpected events and circumstances.
| :: |
Unexpected events
Incidents of lost data, system failures
or unforeseen contingencies are often termed
'unexpected events', and they disrupt
businesses. Over a reasonable period
of time, every organisation will experience
something unpredictable that shuts down
one or more systems. This is not merely likely;
it is inescapable. Only the timing and
precise nature of unplanned downtime are
unknown.
|
Characteristics of Availability
| :: |
Availability is a critical
business requirement to ensure the accessibility
of information resources as and when needed.
Indeed, availability of information is so
critical that it forms one leg of the 'CIA'
(Confidentiality-Integrity-Availability)
triad, which is the foundation of information
security.
|
| :: |
Availability is not a product
or service, but a set of practices utilized
to ensure appropriate levels of access to
data and applications. It requires detailed
planning, meticulous implementation and
periodic review.
|
| :: |
The 'availability' of an
information system is measured not only
by the ability of the system design and
implementation to satisfy required functionality,
but also by the adequacy of redundancy in
terms of system hardware, software and procedures
built into the system to safeguard against
potential disruptions.
|
| :: |
Managing availability stretches
beyond analysing potential hardware failures.
It involves managing the whole environment
(data, applications, servers, operating
systems, middleware, etc.) to ensure users
can access data and applications as and
when, how and where they need them. Ensuring
availability entails facilitating consistent
and predictable access to information resources.
|
| :: |
Ensuring availability does
not always imply keeping systems accessible
all the time. For instance, a business may
not attach critical importance to system
downtime during non-business hours. What
it does imply is that critical information
must be accessible during pre-specified
critical hours, which could mean 24 hours
a day for e-businesses and global corporations.
|
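The point about pre-specified critical hours can be made concrete by counting only the downtime that falls inside the critical window. The sketch below is illustrative: the function names, the 09:00 to 17:00 window, and the two outages are invented for the example, and it assumes the window applies every day, does not cross midnight, and that outages do not overlap one another.

```python
from datetime import datetime, time, timedelta


def overlap_hours(a_start, a_end, b_start, b_end):
    """Length in hours of the overlap between two time intervals (0 if none)."""
    latest_start = max(a_start, b_start)
    earliest_end = min(a_end, b_end)
    return max((earliest_end - latest_start).total_seconds(), 0) / 3600


def critical_downtime_hours(outages, window_start, window_end):
    """Sum the outage time that falls inside the daily critical window."""
    total = 0.0
    for start, end in outages:
        day = start.date()
        while day <= end.date():
            w_start = datetime.combine(day, window_start)
            w_end = datetime.combine(day, window_end)
            total += overlap_hours(start, end, w_start, w_end)
            day += timedelta(days=1)
    return total


# Hypothetical outage log: one overnight, one straddling the end of the day.
outages = [
    (datetime(2024, 3, 4, 2, 0), datetime(2024, 3, 4, 4, 0)),
    (datetime(2024, 3, 4, 16, 0), datetime(2024, 3, 4, 18, 30)),
]
downtime = critical_downtime_hours(outages, time(9, 0), time(17, 0))
print(downtime)  # 1.0
```

Of the 4.5 hours of total outage in this example, only one hour counts against availability as the business defines it, because the overnight outage and the time after 17:00 fall outside the critical window.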
Availability and Reliability
The terms Availability and Reliability are often
erroneously equated. Reliability is commonly measured by
the Mean Time Between Failures (MTBF) of a hardware
component. Reliability is therefore only one component
of availability, which also depends on how quickly
service is restored after a failure. However, with the increasing
quality of hardware equipment, reliability is
less of a concern to most IT managers.
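The relationship between the two terms can be sketched with the standard steady-state formula, availability = MTBF / (MTBF + MTTR). Mean Time To Repair (MTTR) is not mentioned in the text above; it is introduced here as an assumption, and the figures below are purely illustrative.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: mean uptime / (mean uptime + mean repair time)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical components: one fails rarely but is slow to repair,
# the other fails five times as often but is repaired within an hour.
reliable_but_slow_to_repair = availability(mtbf_hours=10_000, mttr_hours=100)
less_reliable_fast_repair = availability(mtbf_hours=2_000, mttr_hours=1)

print(f"{reliable_but_slow_to_repair:.4%}")  # 99.0099%
print(f"{less_reliable_fast_repair:.4%}")    # 99.9500%
```

In this example the less reliable component is the more available one, which is why the two terms should not be treated as interchangeable.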
Developing Availability Solutions
The availability of Information Resources can
be ensured by employing an appropriate combination
of tools, services and processes. The vulnerabilities
relating to availability in all data, applications,
sites, communication links, etc. should be analysed;
the threats to the individual components should
be evaluated; and the potential downtime cost
should be assessed before a solution is actually
deployed.
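As a rough illustration of assessing potential downtime cost before deployment, the sketch below compares the expected annual cost of downtime at the current and target availability levels with the yearly cost of the solution. Every figure and function name is hypothetical, and the model assumes a flat cost per hour of downtime and round-the-clock service.

```python
HOURS_PER_YEAR = 24 * 365  # assumes the service is needed around the clock


def expected_downtime_hours(availability, service_hours=HOURS_PER_YEAR):
    """Expected hours of downtime per year at a given availability level."""
    return (1.0 - availability) * service_hours


def annual_downtime_cost(availability, cost_per_hour, service_hours=HOURS_PER_YEAR):
    """Expected yearly cost of downtime, assuming a flat hourly cost."""
    return expected_downtime_hours(availability, service_hours) * cost_per_hour


def net_annual_benefit(current, improved, cost_per_hour, solution_cost_per_year):
    """Downtime cost avoided by the solution, minus what the solution costs."""
    saved = (annual_downtime_cost(current, cost_per_hour)
             - annual_downtime_cost(improved, cost_per_hour))
    return saved - solution_cost_per_year


# Hypothetical case: moving from 99% to 99.9% availability.
benefit = net_annual_benefit(current=0.99, improved=0.999,
                             cost_per_hour=5_000,
                             solution_cost_per_year=200_000)
print(round(benefit))  # 194200
```

A positive result suggests the solution pays for itself under these assumptions; a negative one suggests a cheaper option, or a lower availability target, may be more appropriate.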
A generic strategy for deploying an availability
solution encompasses the following:
| :: |
Environmental Assessment
The first step is to assess the existing
environment and determine the availability
requirements. For each IT component under
consideration, the assessment should:
:: Determine the
availability needs and concerns
:: Determine the existing recovery
levels based on the current environment
:: Assess what changes can be
implemented to gain additional levels
of recovery
:: Gain technical acceptance
from the entire IT department |
|
| :: |
Planning Services
After the existing environment has been
analyzed and the availability requirements
assessed, the next step is to develop a
plan to implement a high availability solution.
The plan must define:
:: Project objectives
:: Project team members
:: Deliverables timetable
:: Required tasks
:: Required IT infrastructure
changes |
|
| :: |
Education & Training
Human error is a primary cause of unplanned
downtimes. A formal and periodic end-user
and system operator training program on
the use and maintenance of Information Systems
can significantly reduce instances of unplanned
system downtimes. Similarly, a documented
and tested recovery plan can significantly
reduce the duration of any outages.
|
| :: |
Availability Architecture
From the assessment and implementation processes
outlined above flows the architecture that
must be put in place to assure the availability
of the organisation's IT infrastructure. The availability
architecture would include the following
components/implementations: backup &
restore, third-party recovery sites, journaling,
data vaulting, commitment control, uninterruptible
power supplies, fault-tolerant hardware,
RAID, disk mirroring, non-clustered multiple
systems, clusters, alternate communication
paths, heterogeneous replication, auditing
services, etc.
|
Conclusion
The natural intensification of offensive threats
versus defensive countermeasures has demonstrated
time and again that no practical systems can be
built that are invulnerable to attack. Despite
the industry's best efforts, there can be no assurance
that systems will not be breached.
Thus, the traditional view of information systems
security must be expanded to encompass the specification
and design of systems that assure availability
and survivability in spite of attacks. Only then
can systems be created that are robust in the
presence of attacks and are able to survive attacks
that cannot be completely repelled, assuring the
organisation of the availability of its mission-critical
systems.
|