Ref Oreilly Overcoming-It-Complexity
Overcoming IT
Complexity
Simplify Operations, Enable Innovation, and
Cultivate Successful Cloud Outcomes
Lee Atchison
with contributions from Mark Menger
Overcoming IT Complexity
by Lee Atchison
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales promotional use. Online
editions are also available for most titles (http://oreilly.com). For more information, contact our
corporate/institutional sales department: 800-998-9938 or [email protected].
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Overcoming IT Complexity,
the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.
The views expressed in this work are those of the author and do not represent the publisher’s
views. While the publisher and the author have used good faith efforts to ensure that the
information and instructions contained in this work are accurate, the publisher and the author
disclaim all responsibility for errors or omissions, including without limitation responsibility
for damages resulting from the use of or reliance on this work. Use of the information and
instructions contained in this work is at your own risk. If any code samples or other technology
this work contains or describes is subject to open source licenses or the intellectual property
rights of others, it is your responsibility to ensure that your use thereof complies with such
licenses and/or rights.
978-1-492-09843-0
There are several measures that businesses can take to mitigate the effects
of the IT Complexity Dilemma, including investing in automation technologies,
establishing standard operating procedures, and hiring skilled IT professionals.
By taking these steps, businesses can improve their ability to manage and opti-
mize their IT operations, minimizing the negative impacts of complexity.
However, even with these measures in place, businesses will still face chal-
lenges in managing their IT operations due to the increasing complexity of
modern IT systems. By attempting to mitigate the effects of this complexity,
businesses can improve their efficiency, security, and competitiveness in today’s
increasingly complex IT environment.
As we will discuss later in this chapter, the core of the complexity dilemma is
technical debt. In fact, technical debt and complexity go hand in hand. We’ll
learn that, beneath the surface, technical debt is much more than a backlog of
needed code refactorings.
also detect and resolve problems when they occur. A close, working relationship
between development and operations is critical to this speed.
However, natural separations start taking shape as applications grow in
complexity and the organization grows in size. Traditional divisions between
development and operations begin to formalize, and the space between the two
groups grows and expands.
In the past, a flat management structure helped the organization keep
communications flowing among the development, operations, security, and
product leadership teams. But as the organization grows, keeping it flat and
responsive becomes harder and harder.
Management and organizational structures are required to keep the growing
organization operational. Formalized processes help yield consistent results and
plans. Yet, these same structures and processes create a natural blockage to
communications flow. This blockage makes
it harder for the organization to function. Teams split, and the organizational
distancing limits cooperation and communications. This limits growth.
Ironically, the biggest inhibitor to growth is, in fact, growth.
The organizational structure gets more complex, and the application gets
more complex.
Since an IT organization is only as good as the management that drives it, it
is essential to have a strong and effective management team in place. This team
is responsible for steering the organization in the right direction, setting goals
and objectives, and ensuring that all aspects are running smoothly.
The management team must also adapt to changes quickly and effectively
respond to new demands in the marketplace. They must work across organiza-
tional boundaries, and operate in unison with all product, development, opera-
tional, and support teams.
How the IT organization is structured varies considerably based on the
nature of the business. The type of company is the biggest indicator of the
structure of the IT organization. And nothing drives the organization more
significantly than where and how the software development teams are organized.
Let’s look at each of the three types individually and their characteristics.
Then we will look at how the IT and software development organizations look
different inside each type of company. In the end, you will find that your circum-
stances lead to an amalgam of two or three of these types.
and maintains only the services they are responsible for. Operations teams pro-
vide a smooth operations infrastructure that supports the development teams.
The focus of the enterprise, typically, is on the software development teams.
These companies may have separate IT organizations to support business pro-
cesses, but the application development teams are not part of that organization.
Take a look at the example in Figure 1-2. Non-SaaS software companies have
the same focus on application development teams as SaaS-focused organizations,
but these teams are typically not DevOps teams. They are independent software
development teams that produce software sold and delivered to customers. The
IT organization is small and provides support to the company as a whole, includ-
ing development and operations of tools for the company. The IT organization
is separate and isolated from the primary, mainstream product development
software teams.
software to run their business, but their primary business is not software. This
category covers almost all non-technology-focused companies: banks, restaurants,
stores, taxi services, airlines, railroads, media companies, and so on.
Notice that some companies do fit multiple categories. For instance, Micro-
soft offers SaaS services (Office 365) and non-SaaS software (Microsoft Word and
Halo). Additionally, a company such as Charles Schwab may offer investment
software as a service, yet they also focus on general financial services and invest-
ments. These companies may have different divisions that appear to be separate
companies, each structured differently. Or they may have a hybrid structure.
Keep in mind that these categories are generalizations.
The mission of a non-software company is not technology focused. They may
use software as a tool internally to manage the sales, marketing, manufacturing,
or other business processes, but software is secondary to their primary business.
As such, there are no large application development teams. There is an
IT organization, and that organization will have a relatively small development
and operations team within the IT organization itself. Calling the organization
“small” is in relation to the company itself. Only a small portion of the compa-
ny’s resources are invested in IT systems and personnel.
Figure 1-3 illustrates this. The company leadership has bigger things to focus
on, leaving IT leadership to manage these small development and operations
teams.
WHAT IS THE MODERN IT COMPLEXITY DILEMMA? | 15
architects, and software leaders tend to gravitate towards the much more lucra-
tive opportunities in SaaS application development, and other software-centric
companies.
This means that organizations where software plays only a secondary role in
the company’s mission find it difficult to attract and retain software talent. Often
this means the organization, as a whole, suffers. Yes, there are high quality,
talented developers in these organizations, but they are much harder to locate
and hire, and hence tend to be less available to a typical non-development-centric
organization. This tends to create less innovation and fewer creative solutions to
problems in these types of organizations. Rather than state of the art software
applications, the applications created in such organizations tend to be supportive
applications that lack a high degree of innovation.
The caliber of your IT development teams is critically important in determin-
ing the sophistication of applications your organization can support, and your
organization’s ability to respond to the increase in complexity that occurs over
time.
As a result, organizations where software is a secondary rather than a primary
part of the business tend to be more sensitive to the IT complexity dilemma.
these organizations. Organizations that are less operations focused need less
investment in this area, and hence don’t attract as much interest.
Traditional operations, however, is changing. Many traditional operational
capabilities are handled by outsourced infrastructure, such as SaaS applications
and cloud service providers. Additionally, newer tooling and capabilities auto-
mate a large portion of basic operational needs.
Tools such as Infrastructure as Code (IaC) and Operations as Code (OaC) help
with this automation, and strive to make operational setup and basic operational
responsibilities automated and repeatable. This improves overall operational
reliability. Additionally, since scripts and script-like descriptions drive IaC and OaC,
these capabilities encourage code as documentation and knowledge sharing of
the operational environments involved. Finally, since IaC and OaC generalize the
operational aspects of an application into code-like capabilities, they allow the
use of standardized and well understood development processes, such as revision
management. Revision management allows tracking and correlating failures to
changes, creating a better operating team, reducing mistakes, increasing security
and traceability, and improving overall accountability.
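To make the idea of Infrastructure as Code more concrete, here is a minimal sketch of the desired-state approach that many IaC tools follow. It is not the API of any real tool: the `Server` type and the `plan` and `apply` functions are hypothetical names chosen for illustration. The desired environment is described as data, a plan is computed by comparing it with the current environment, and applying the plan is repeatable, so running it a second time changes nothing.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    """Hypothetical description of one piece of infrastructure."""
    name: str
    size: str


def plan(current: dict[str, Server], desired: dict[str, Server]) -> list[tuple[str, str]]:
    """Return the (action, name) steps needed to move current toward desired."""
    steps = []
    for name in sorted(desired):
        if name not in current:
            steps.append(("create", name))
        elif current[name] != desired[name]:
            steps.append(("update", name))
    for name in sorted(current):
        if name not in desired:
            steps.append(("delete", name))
    return steps


def apply(current: dict[str, Server], desired: dict[str, Server]) -> dict[str, Server]:
    """Carry out the plan and return the resulting state."""
    state = dict(current)
    for action, name in plan(current, desired):
        if action == "delete":
            del state[name]
        else:  # "create" and "update" both install the desired definition
            state[name] = desired[name]
    return state


current = {"web-1": Server("web-1", "small"), "old": Server("old", "small")}
desired = {"web-1": Server("web-1", "large"), "db-1": Server("db-1", "medium")}

for step in plan(current, desired):
    print(step)

new_state = apply(current, desired)
print(plan(new_state, desired))  # an empty plan: nothing left to change
```

Because the desired state is ordinary text, it can be kept under revision management like any other code, which is what makes it possible to correlate a failure with the change that introduced it.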
thin operations support organization, but the job of this organization is not to
manage the operations of the application; rather, their job is more of a tools and
infrastructure team. They provide tools and assistance to the product teams that
own and operate their individual services.
It’s hard to pinpoint exactly when complexity begins within a young startup IT
organization, but it’s usually tied to some decision that was meant to reduce time
to market or cost to market, at the expense of some further work or cost later on.
This starts the slow and inexorable climb in complexity. For larger, more estab-
lished enterprises it’s undoubtedly tied to the incorporation of technology into
the established enterprise’s processes. In either case, the increase in complexity
is tied to the increase in technical debt. So the advent of IT complexity is driven
by the collection of technical debt within the organization.
Let your financial debt grow too large, and you will go broke. Let your
technical debt grow too large, and your application will become unsupportable
and unsustainable.
How does technical debt grow? Technical debt can grow naturally and quietly
during the normal product development process. Every project that contributes
to a product also contributes to its technical debt. This is illustrated in the top
box in Figure 1-5. During normal product development, work and output are
added to the product, and some amount of debt is added to the stack of technical
debt.
Sometimes, a project is done “quick and dirty”, such as when a new feature
is added without proper design in order to get it out the door quicker. In these
cases, the project adds more debt. Sometimes, the project can even add more
technical debt than useful capabilities. This is the example project shown in the
middle box in Figure 1-5. More technical debt is added to your application than
the amount of real value the project provided.
To keep technical debt from growing without bounds, some effort needs to
be added to each project to reduce the technical debt. As shown in the bottom
box in Figure 1-5, keeping your technical debt at a sustainable level requires
constant investment in reducing the technical debt over time.
This constant flow of increasing and decreasing technical debt is one of
the reasons why it can sneak up on a product. If more debt is regularly added
than is reduced from the backlog, the debt will grow, yet the growth may not be
noticeable. It’s not until the debt has grown to a point where it starts having a
negative impact on your product that you notice its size. At this point, it may be
too large to deal with effectively and easily.
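The way debt can sneak up on a product is easy to see with a little arithmetic. The following is a rough sketch, not a measurement technique: the function name `simulate_debt` and the effort units are assumptions made for illustration. Each project is represented as a pair of numbers, the debt it adds and the debt it pays down, and the function tracks the running total.

```python
def simulate_debt(projects, start=0):
    """Track total technical debt across a sequence of projects.

    Each project is a (debt_added, debt_paid_down) pair in arbitrary
    effort units. Debt cannot drop below zero.
    """
    history = []
    debt = start
    for added, paid in projects:
        debt = max(0, debt + added - paid)
        history.append(debt)
    return history


# Every project adds slightly more debt than it removes.
slow_growth = simulate_debt([(3, 2)] * 10)
print(slow_growth)

# Every project pays down exactly as much debt as it adds.
sustainable = simulate_debt([(3, 3)] * 10, start=4)
print(sustainable)
```

In the first run, no single project looks alarming, since each one leaves behind only one extra unit of debt, yet the total grows steadily with every project. In the second run, constant investment keeps the total flat.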
Each project can either increase or decrease the technical debt within a
system. During a full, high quality project, the planned work often includes
doing all the work necessary, along with working on reducing some amount of
related technical debt. When the work is completed, the technical debt for the
application is lower than it was before. This is illustrated in Figure 1-6. The work
completed for the project is larger than the project itself, and the extra effort is
towards reducing the size of the technical debt. This is a project that’s dealing
with technical debt in a healthy way.
Unfortunately, many projects are much more quick and dirty. They are
designed to only complete as much of the project as is absolutely necessary, leav-
ing the rest of the project as work that will be completed later. In fact, a common
project management philosophy involves building an MVP — Minimum Viable
Product, essentially dictating that you do as little product work as possible to get a
functional product out the door.
The result is work that is not completed. More often than not, this increases
the overall technical debt of the application. This phenomenon is illustrated in
Figure 1-7. Here, the work completed is only part of the total project. We have left
some amount of the project’s work undone. This uncompleted work ends up
increasing the overall technical debt remaining in the project.
Figure 1-7. A quick & dirty project often increases technical debt
In any case, as projects are executed, the amount of technical debt can vary
over time, sometimes decreasing, sometimes increasing. The more full, high
quality projects that are completed, the lower the overall technical debt. The more
quick and dirty projects used to implement functionality, the higher the resulting
technical debt. The types of projects you execute will, over time, vary the total
technical debt you have in your application.
The Negative Impact of Technical Debt
Sometimes building a simpler solution now and delaying the longer-term
implementation is advantageous (this is the Figure 1-7 situation). It allows you
to get a solution out to customers
earlier, which allows the company to start monetizing the change, and receive
customer input on capabilities the customer likes and does not like, which can
be fed into a later, more ambitious solution. This is analogous to saying that
borrowing money is advantageous if you use the money to contribute to a greater
cause, such as purchasing a home. Paying some interest on borrowed money
is fine, as long as the money you borrowed is put to good use. So too, with
technical debt, managing some technical debt is useful and appropriate as long
as you give value to your product and your company. Technical debt becomes
a problem when it is left unresolved—unresolved technical debt ages over time
and increases in cost.
Using the financial metaphor, technical debt becomes a problem when it
builds up so that the cost of servicing the debt is too great, and it impedes your
ability to invest in future projects. So, too, technical debt becomes a problem
when managing and servicing that debt is overwhelming compared to managing
and servicing the product.
When too many quick and dirty projects rule your project plan, and projects
designed to reduce debt are not staffed in your company, your debt starts to
become unmanageable. If this goes on for too long, technical debt overwhelms
the project, as shown in Figure 1-8.
In this scenario, servicing the debt becomes the dominant role of your team,
and you spend little or no time contributing to improving the product. Your debt
is too large to be effectively managed.
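The financial metaphor can be sketched in a few lines of code. This is an illustrative model only. It assumes that servicing debt costs a fixed fraction of the outstanding debt each period (the `service_rate` parameter), and that a team has a fixed capacity per period. Neither assumption comes from real data, and the function name is hypothetical.

```python
def capacity_over_time(capacity, debt_per_period, service_rate, periods):
    """Return the capacity left for product work in each period.

    capacity:        total team effort available per period
    debt_per_period: new technical debt added each period
    service_rate:    assumed fraction of outstanding debt that must be
                     spent on servicing it each period
    """
    remaining = []
    debt = 0
    for _ in range(periods):
        debt += debt_per_period
        servicing = min(capacity, debt * service_rate)
        remaining.append(capacity - servicing)
    return remaining


# A team with 100 units of capacity that adds 80 units of debt per period.
print(capacity_over_time(capacity=100, debt_per_period=80,
                         service_rate=0.25, periods=6))
```

Under these assumptions, the effort available for improving the product shrinks every period until servicing the debt consumes the whole team.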
upon success, while the brittle application is ready to roll off the top of the
hill into failure at the smallest of nudges.
diluted, less accurate, higher level, or more specialized. Broad, general-purpose,
yet detailed knowledge of the application as a whole is no longer possible for
any single individual.
The knowledge that engineers do have about the application becomes obsolete
more quickly
Complex applications change frequently, and engineers’ knowledge about
how the application works becomes outdated more quickly.
It gets harder to bring new engineers up to speed
Complex applications have long learning paths. This is not only because
there is more to learn. The knowledge needed for new engineers to become
productive is more distributed, anecdotal, and out of date.
The net result of these issues is higher organizational pain. This pain translates
into poor-quality changes, less motivated staff, and ultimately staff turnover.
Higher turnover means greater need to train new engineers, which is harder as
the pain increases. Brittleness leads to lower availability, and customer-visible
issues and failures.
This is the pain of complexity.
Messy Desk Syndrome
Imagine a perfectly clean desk. Now, take a sheet of
paper and set it in the corner. Is your desk messy? No, not yet. Now you take
fifty other sheets of paper that go together and set them on top of the one sheet
in the corner. Is your desk messy? No, not yet. Now imagine more sheets, but
these don’t go with the stack in the corner, they are for a different project. So you
put them in different locations on the desk, just single sheets in single locations,
seemingly in a location that makes sense. Now put more papers and documents
and books and folders and pictures one at a time all over the desk. If you don’t
know where something goes, just put it in a new location. You’ll figure it out
later. Sooner or later, your desk is messy. In fact, it’s extremely messy.
Unless you establish a solid plan for organizing the papers on your desk at
the beginning, and stick with it, sooner or later your desk will become messy,
one sheet at a time.
Your desk becomes messy because you didn’t have a plan from the begin-
ning, but just “winged it” along the way. You made your desk messy simply by
using and working on it.
So too, your organizational pain becomes large because you didn’t have an
architectural plan from the beginning. Rather, you started with no plan and
adjusted and changed the plan as time went along. You “winged it”,
metaphorically. You have added technical debt, and hence organizational pain, simply by
working and building the application.
Every action you take, little by little, builds up. Your technical debt grows a
bit at a time until it becomes overwhelming.
• “Let’s change our login process to allow saving login credentials in the
user’s browser”
• “Let’s add this new feature to that menu”
• “Users would rather this feature work in three steps rather than the cur-
rent four steps. Let’s combine two of the steps.”
• “We need to remove the per-session limit on this resource”
• “We don’t have time to build this full feature now, but we can build this
smaller feature, which will make many customers happier. We can do the
rest later.”
• “Let’s release this feature this way first, and then we can collect input from
customers and modify it to make it more user friendly as we get more
input”
Any one of these statements can correspond to a simple set of changes that
makes perfect sense at the time. It might not have any obvious impact on overall
technical debt at all.
But the little changes…and the little debt…and the little impact…and the little
piece of paper on the corner of your desk…add up. And like the messy desk,
each action may individually seem perfectly benign. Actions may look perfectly
acceptable. But, when combined, they multiply and become overwhelming.
Complexity in an IT Organization
Complexity grows in IT organizations as well. Complexity starts by growing
within your application. As your application grows complex, so does the infra-
structure needed to run the application. Your IT operations become more com-
plex. Your engineering organization becomes more complex. To wrap their
minds around all of this, your IT management gets more complex.
What started as a simple increase in the needs of your application has
changed into the growth of a complex IT organization.
An organization that was once agile tends to change and migrate over time.
It changes into either a robust or rigid organization—one that is afraid of and
rejects change in order to keep the system stable and supported, or it changes
into a fragile organization—an organization where every minor change risks
breaking a larger system or process, limiting the ability to adjust and grow. This
is illustrated in Figure 1-10.
Figure 1-10. An Agile organization fails over time either by becoming rigid, or fragile.
IT Death
So, technical debt leads to complexity, and complexity leads to organizational
pain. This all ultimately leads to IT death.
But what does IT death look like?
IT death is what happens to an organization when the pain of complexity
sends the organization into a state of ineffectualness. It cannot improve, it cannot
grow, and hence it stagnates. Since competitors will continue to grow, an organi-
zation’s stagnation ultimately leads to its death.
You see it in many organizations.
Xerox, long the leader in copiers for larger organizations, suffered from the
inability to pivot from copiers to the personal computer. Despite the fact that
Xerox PARC originally conceived the modern personal computer user interface,1
they were unable to compete with Microsoft and Apple for the personal computer
operating system. Arguably, without Xerox PARC, there would be no Apple Mac-
intosh computer, yet Xerox’s inability to pivot kept them from this innovation.
It’s not just technology companies that suffer this fate. Firestone, the tire
company, was facing the difficult task of modernizing its tire creation process
in light of radial tire technology created by one of its competitors, Michelin.
Firestone bogged down and could not update its processes to handle the new
technology. Try as it might, it kept making tires that customers did not want, and
its business suffered. Ultimately, Firestone was absorbed by Bridgestone. This
is an example of what the Harvard Business Review2 calls Active Inertia.
Many other originally highly innovative companies fall into the trap of IT
death by losing their ability to innovate. Hewlett Packard, one of the founding
companies of Silicon Valley (the heart of technical innovation across the world),
found that its lack of innovation led to a slow death spiral.
And let’s not talk about the innovation failure of Polaroid, which couldn’t
innovate new camera technology; or Blockbuster Video, which failed to embrace
the importance of video streaming technology.
And Borders book store, which was overwhelmed by the innovation of the
upstart Amazon.com.
Technical debt and complexity slow down innovation. They keep companies
from staying competitive, and ultimately this results in their eventual downfall,
and potentially even death.
Summary
Hence, the IT complexity dilemma. IT agility is critical to building a successful
company, yet the very success itself adds technical debt and complexity, and this
complexity leads to either rigidity, or fragility. In either case, ultimately, competi-
tors will outpace the organization in innovation, and the organization dies. Long
term success for a company means managing the IT complexity dilemma.
1 Xerox’s Palo Alto Research Center.
2 Why Good Companies Go Bad, Harvard Business Review.
About the Author
Lee Atchison is a recognized industry thought leader in cloud computing, and
the author of the best-selling book Architecting for Scale, published by O’Reilly
Media, currently in its second edition. Lee has 34 years of industry experience,
including eight years at New Relic and seven years at Amazon.com and AWS,
where he led the creation of the company’s first software download store, created
AWS Elastic Beanstalk, and managed the migration of Amazon’s retail platform
to a new service-based architecture. Lee has consulted with leading organizations
on how to modernize their application architectures and transform their organi-
zations at scale. Lee is an industry expert and is widely quoted in publications
such as InfoWorld, Diginomica, IT Brief, Programmable Web, CIO Review and
DZone. He has been a featured speaker at events across the globe from London
to Sydney, Tokyo to Paris, and all over North America. LinkedIn profile:
https://www.linkedin.com/in/leeatchison.