
AUDIT REPORT
Prepared by Oshri Cohen, Fractional-CTO

<ACME> Software
<ACME> Software is a successful technology business that has seen rapid
growth in the last few years and needs help meeting its delivery goals. In this
audit, we evaluated the software engineering, project, technology, and
product management functions. This audit aims to identify the root cause of
inefficiencies and recommend high-level changes that should be
implemented to resolve each issue.

Auditor Statement
The auditor, Oshri Cohen, met with several individuals responsible for
<ACME>'s operational and technological functions. It is this auditor’s
professional opinion that the team is motivated to move the company
forward and improve operations; however, there was a slight defeatism among
some team members that needs to be addressed through a reorganization of
processes and authority.

The audit did not delve into the codebase as it aimed to identify the root
cause of operational inefficiencies. The codebase itself is very problematic and
will need heavy investment to resolve, and that alone will not solve the many
problems at <ACME>.

The primary business function that the audit focused on was configuration
and consulting operations, which account for a significant portion of revenues
and also experience consistent friction and inefficiencies. Despite that, the
team operates very well and is fully aware of its problems; however, it lacks a
vision of what the solution may be.

It was almost immediately clear that the technology and engineering at
<ACME> had evolved to meet the business's needs but were never explicitly
designed to do so, moving from one state to the next as needs arose. This
has contributed to a tremendous amount of technical debt and inefficiencies
that the business has been forced to accommodate rather than explicitly fix.

<ACME> is running on an aging engineering process, further exacerbated by
an extreme interpretation of SOC 2 that separates the deployment of the
product from developers. This separation is not necessary; there are
controls that can be put in place to remain SOC 2 compliant.

All in all, nothing happening at <ACME> is out of the ordinary for a
technology business at its current stage of growth, but it must be rectified,
as it will only get worse with every new client. At some point, hiring more
people will have a negative effect.

Modern development practices have been explicitly designed to simplify
development with lower-cost employees, make deployments so simple that they
are almost an afterthought, protect the stability of the business from even
the most senior employee leaving, and ensure high-quality deliverables every
time. All in all, these practices reduce management effort through systemized
processes that no one can circumvent without oversight.


Audit Outcome
The following are the major problems that were identified through the audit.
For each problem, the audit presents the OKR (Objectives and Key Results)
that, if met, will resolve the issues. The exact action plan is not included in this
report because an in-depth analysis is required to produce one with any
degree of confidence. Furthermore, all action plans must involve <ACME>’s
people, who know better than anyone else.

1. The product is not designed as a platform.


This problem is a root cause of project management issues and lower
profitability across the board at <ACME>. The product is sold as a platform,
and its customizability is touted by sales and marketing as a competitive
advantage, but it is not designed as a platform and is missing many key
features.

A platform is defined as a system that allows for custom development and
configuration; what is missing from the core product is the ability to develop
and test changes without having to deploy the source code. The closest
products in characteristics are Salesforce and Microsoft Dynamics, two
systems that were designed for hyper-customizability using low-cost resources.

The current development model for configurations is to create them in a
demo/dev/test environment, export them, and send the package to IT for
deployment once a week. Furthermore, anything that the core product can’t
do is achieved as a custom module, which would be fine if the module were
designed to be deployed as part of a platform rather than as a side-by-side
copy of the code in a production environment. At close to 100 deployments a
week, this is a very complicated process to do manually. This auditor is
impressed by the low volume of failed deployments based on what has been
disclosed.

OKR (Objectives and Key Results)

Objective

Transform the product into a fully mature platform

Key Results

1. Zero involvement with IT for deployments beyond the initial setup.
2. The feedback loop on deployments is down to minutes.
3. Deployment multiple times per day without IT.
4. Over-the-air deployments (no human involvement).
5. Continuously improve the core product to simplify configurations.

Actions

1. Reduce the feedback loop on deployments to minutes
a. Allow deployments without involving IT staff
b. Allow deployments as a native feature
c. Allow revert of deployments in a non-destructive way
d. Allow deployments in a non-destructive way
e. Allow deployments without copying any file at all
2. Reduce the need for development staff on configurations by 80%,
reserving developers for only the most extreme solutions (maintain the
same price at a significantly lower cost)
a. Implement a development environment supported within the
product directly rather than deployed manually through the
“Push Tool.”


b. Eliminate the need for Stored Procedures and replace them with
direct SQL wrapped in dotnet classes/interfaces. Stored
procedures are hard to deploy, require advanced permissions, and
are not backward compatible in case of a revert, increasing the
risk to deployments.
c. Implement the ability to test configurations using “test data” by
the customizer directly rather than a QA person.
d. All configurations and custom modules must be stored directly in
the database rather than as files.
e. Implement a “custom entity” concept that allows for custom
tables and data points to be stored in the DB without requiring
IT staff to migrate the database.
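
To make the idea of non-destructive, reversible deployments stored in the database more concrete, here is a minimal sketch in Python. It is not <ACME>'s design: the table names (config_version, active_config), the function names, and the use of SQLite are all assumptions made for illustration. The point it shows is that a deployment only appends a new version and moves a pointer, so a revert is just moving the pointer back and no file is ever copied or overwritten.

```python
import json
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE config_version ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " client TEXT NOT NULL,"
        " payload TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE active_config ("
        " client TEXT PRIMARY KEY,"
        " version_id INTEGER NOT NULL REFERENCES config_version(id))"
    )
    return conn


def deploy(conn: sqlite3.Connection, client: str, config: dict) -> int:
    """Append a new version and point the client at it; nothing is overwritten."""
    with conn:  # single transaction: both statements commit or neither does
        cur = conn.execute(
            "INSERT INTO config_version (client, payload) VALUES (?, ?)",
            (client, json.dumps(config)),
        )
        version_id = cur.lastrowid
        conn.execute(
            "INSERT OR REPLACE INTO active_config (client, version_id) VALUES (?, ?)",
            (client, version_id),
        )
    return version_id


def revert(conn: sqlite3.Connection, client: str, version_id: int) -> None:
    """Re-activate an earlier version; newer versions remain in the history."""
    row = conn.execute(
        "SELECT 1 FROM config_version WHERE id = ? AND client = ?",
        (version_id, client),
    ).fetchone()
    if row is None:
        raise ValueError(f"version {version_id} does not exist for {client}")
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO active_config (client, version_id) VALUES (?, ?)",
            (client, version_id),
        )


def active(conn: sqlite3.Connection, client: str) -> dict:
    """Return the configuration the client is currently running."""
    row = conn.execute(
        "SELECT v.payload FROM active_config a"
        " JOIN config_version v ON v.id = a.version_id"
        " WHERE a.client = ?",
        (client,),
    ).fetchone()
    if row is None:
        raise KeyError(client)
    return json.loads(row[0])
```

Because every version is kept, the history doubles as an audit trail of what was deployed for each client, which is the kind of record a manual file copy cannot provide.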


2. No Technical Organization
<ACME> has a development team, some product management, customer
support, and a configuration team, each with a leader, but it needs a technical
organization that operates as a single unit. In modern technology companies,
a technical organization has product management, support, engineering, IT
operations, customer implementations, and technical operations organized
by the CTO. A healthy technical organization can operate without a CTO as
every team is accountable to itself and the technical org.

Product management is almost nonexistent, with the core product team
receiving signals from various sources, including sales/marketing, support,
etc. Product management should be based on business metrics, which are
either not currently present at <ACME> or not known to the teams. This is
especially true with the rewrite of the V5 core product.

OKR

Objective

Implement a technical organization that operates as a single unit, including
product management, support, configurations, and core development.

Key Results

1. All technical matters fall under the technical organization that operates
as a single unit.
2. Technical operations are solid and predictable, with little to no friction.


Actions

1. Implement a feedback loop that gathers configuration work efficiency,
defects, friction, and bugs. A product manager will collect and process
these on a bi-weekly basis.
2. Product management should be explicitly linked to a business KPI,
such as “lower cost of configurations by x%,” “speed up deployments…”
and so on.
3. Product management should review support issues regularly and feed
them into core product features to mitigate/eliminate them in a given
quarter.
4. Product management’s list of priorities should be publicly available for
the company to see and discuss, and no “walled gardens” should exist.
5. Implement a single technical leader who will oversee all technical
operations.
6. Executives should be asking for constant improvements that are
measurable financially.


3. Aging Engineering Practices


The engineering practices at <ACME> date back to the early 2000s. This audit
discovered that deployments are handled manually, there is currently no unit
or integration testing in the core product (although this is planned for
V5), and configurations may or may not be tested, as adherence is based on
personal choice rather than enforced through a deployment pipeline that
cannot be circumvented.

While SOC 2 has somewhat complicated the deployment process by
separating the developers from those who deploy the code onto client
systems, the requirement is slightly misunderstood. What it actually means is
that developers should never be allowed to deploy code that hasn’t been
reviewed by at least two people, one of whom should be a manager.
Furthermore, the developer should not have direct access to a client machine.
As such, implementing CI/CD both fulfills SOC 2 requirements and
dramatically improves productivity and quality.
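
As an illustration of how a pipeline could enforce the review rule described above instead of relying on personal discipline, here is a minimal sketch in Python. The Approval type and the may_deploy function are hypothetical names, not part of any CI/CD product, and the rule encoded is this report's reading of the control (two reviewers other than the author, at least one of them a manager), not a quotation from SOC 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Approval:
    reviewer: str      # hypothetical: login of the person who approved
    is_manager: bool   # hypothetical: whether that person holds a manager role


def may_deploy(author: str, approvals: list[Approval]) -> bool:
    """Allow deployment only with two distinct non-author approvals, one from a manager."""
    # Ignore self-approvals and count each reviewer only once.
    independent = {a.reviewer: a for a in approvals if a.reviewer != author}
    has_manager = any(a.is_manager for a in independent.values())
    return len(independent) >= 2 and has_manager
```

A check like this would run as a mandatory pipeline step, so the deployment itself stays automated while the segregation of duties is preserved.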

The lack of career planning is concerning, as developers don’t know how their
careers will move forward, which can exacerbate the already problematic
engineering culture. The teams that function well inspire the engineering
culture.

OKR

Objective

Evolve engineering practices into generally accepted practices implemented
by thousands of tech-first companies around the world. Allow multiple
deployments per day with a single click from within the engineering and
configuration teams.

The primary objective is to allow deployments within minutes rather than
once per week.

Key Results

1. Deployments can be made within minutes with oversight.
2. Single-click restore of any failed deployment to the previous state.
3. Active mitigation of performance problems as they arise through
alerting, e.g., “Alert on request > 250ms for at least 5 minutes”
4. Data-driven product management
5. Happier developers who will stay longer through a strict career
trajectory
6. Senior developers are not required to provide custom modules.
7. Core team developer onboarding in less than an hour.
8. Configurations team developer onboarding in less than 3 days.
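
The alerting rule quoted in the key results above ("request > 250ms for at least 5 minutes") can be sketched as follows. In practice a monitoring product would evaluate such a rule; this Python sketch only shows the logic, and the Sample type, the should_alert function, and the default values are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    timestamp: float   # seconds since an arbitrary starting point
    latency_ms: float  # observed request latency at that time


def should_alert(
    samples: list[Sample],
    threshold_ms: float = 250.0,
    window_s: float = 300.0,
) -> bool:
    """Return True if latency stayed above the threshold for at least window_s."""
    breach_start = None
    for sample in sorted(samples, key=lambda s: s.timestamp):
        if sample.latency_ms > threshold_ms:
            if breach_start is None:
                breach_start = sample.timestamp  # a new breach begins here
            if sample.timestamp - breach_start >= window_s:
                return True
        else:
            breach_start = None  # latency recovered, so the breach is over
    return False
```

Requiring the breach to be sustained avoids paging someone for a single slow request while still catching a real degradation within minutes.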

Actions

1. Implement automated deployments directly from source control.
2. Implement automated testing for full regression testing prior to
deployment.
a. All new work and bug fixes must be validated with at least one
additional unit test to confirm that the change works and to catch
the defect if it ever resurfaces.
3. Deployment without the manual copy-paste of files by IT.
4. Enable full local development capability based on a Docker
environment that closely simulates a client’s production environment.
5. Enable logging and monitoring across all systems through a centralized
system such as Datadog or otherwise.
6. Implement a plugin architecture.
7. Enable user activity logging in the front-end through the likes of
Sentry.io or FullStory (inputs can be obfuscated to protect privacy).
8. The technical organization must decide on remuneration and promotions
of developers within a strict “Career Engineering Competency Matrix.”


4. Missing Technical Leadership


<ACME> had technical leadership in the past, and that leadership was
responsible for its technical success until recently, when rapid growth
stressed friction points to the breaking point. What worked before no longer
works at the current scale and will soon crumble under the pressure.

This has become evident in the ongoing development of the V5 version of the
core product, which is being designed using the latest technologies. However,
when questioned about the business goal of the V5 rewrite, none could be
provided except for the statement that “it will be built better.” This is a big
problem, especially when faced with a 2-year development timeline with little
to no quantifiable business value and seemingly no executive sponsors.

It is this auditor’s opinion that it makes sense to evolve the technology on
which the core product is built. This is the right decision; however, the
approach of doing a full rewrite is risky, and there is a high chance that the
project will fail by being cancelled.

There need to be more experts in architecture, backend, frontend, and
database development. While the teams are staffed with smart and driven
individuals who can learn the tech and deliver the expected results, this does
not make up for the lack of firsthand experience with such evolutions.


OKR

Objective

Redesign the entire organization into a technical organization with a
technical leader who will drive technology and engineering.

Key Results

1. There is zero friction between business and technical organizations.
2. The technical organization proactively works with the business.
3. There is a single technical leader who can speak for and with business.
4. Engineering teams can operate without a technical leader for at least
two weeks.
5. A sense of confidence in technology and engineering.
6. Technology enables growth rather than restricts it.

Actions

● All technical tasks can be traced to real business value.
● Technical leadership has expertise in the technology.
● Automate and enforce all processes through systems rather than
documentation.
● Product management practice that proactively and continuously
improves the product based on data.


5. Configuration Team Efficiency


The problems experienced by the configuration team directly and
significantly impact profitability. The need to find more senior developers to
do configuration and light development work is problematic regarding
recruitment and retention.

This team's development process is very inefficient because of the lack of
tools. The deployment process is precarious at best since it is done manually
for each client. To make matters worse, only IT can perform the actual
deployment, by logging on to client servers and copying the files by hand. SOC
2 states that the developer should not be allowed to deploy when the
deployment is done manually rather than automatically. This is exacerbated
by the need to manually patch the database for stored procedures and
schema changes.

Developing configurations and stored procedures involves “exporting” the
changes, packaging them into a zip file, and asking IT to deploy them. Apart
from the security flaws that violate the deployment file chain of custody,
deployments happen on Fridays at 2 PM. Given the very manual and
error-prone nature of the deployment process, significant downtime for a
client is likely and may already occur regularly, even if it is fixed just as
fast; regardless of how quickly they are fixed, such failures should never
happen. When asked about deployment success rates, the team could not
immediately provide them because they are not tracked as a KPI. Deployment
success rate is one of the measures covered by the DORA metrics.


DORA Metrics

For the most part, <ACME> is doing well through its teams' sheer tenacity and
professionalism; even with all the current friction, they hit the low-medium
bracket and should be targeting “Elite” for optimal ROI from engineering
dollars.
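
Since the deployment success rate was not available as a KPI, here is a minimal sketch of how the four DORA metrics (deployment frequency, lead time for changes, change failure rate, and time to restore service) could be computed once each deployment is recorded. The Deployment record and the dora_metrics function are hypothetical; they assume <ACME> would log a commit time, a deploy time, and the outcome of every deployment, which this audit did not confirm.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass(frozen=True)
class Deployment:
    committed_at: datetime                  # when the change was finished
    deployed_at: datetime                   # when it reached the client
    failed: bool = False                    # did it cause a failure in production?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deployments: list[Deployment], period_days: int) -> dict:
    """Compute the four DORA metrics over a reporting period."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")
    failures = [d for d in deployments if d.failed]
    restore_times = [
        d.restored_at - d.deployed_at for d in failures if d.restored_at is not None
    ]
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    return {
        "deployments_per_day": len(deployments) / period_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "median_time_to_restore": median(restore_times) if restore_times else None,
    }
```

With a weekly manual deployment window, the lead time for a change is bounded below by the wait for the next Friday, which is one reason the metrics improve so sharply once deployments are automated.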

Developing custom modules is just as problematic, with this option employed
when the core product’s features can’t meet the requirements. An <ACME>
custom module is not a traditional module; it is a separate web application
with its own codebase and configurations that is compiled and deployed
manually by copying over the files. Custom modules are “unregulated” in that
the developer can do pretty much whatever they want, from creating new
database tables to even modifying core application tables, which is a major
violation of any system designed to be customized and can cause data
corruption as well as irreversible damage. This problematic development and
deployment model indicates a product that is thought of and expected to
behave as a platform but is built like a fixed product.

From the quality assurance perspective, while the configuration team cares
about the quality of the output, the QA process is not enforced through
systems but rather through an understanding with the developers and
configurators, and they can choose to follow it or not. This was fine in the past
when growth was more predictable, but now, QA is falling in priority when
deciding whether to deploy now or wait another week, which will result in an
unhappy client. This is problematic in that it creates time pressure on the
team, which leads to cutting corners, skipping testing and, at times, failed
deployments on a Friday at 4 PM, which can then lead to a slew of other
issues. These downtimes and failed deployments are rare because the team is
well-managed, but even the best managers can only do so much. However,
because this data is not tracked, the audit cannot provide hard figures.

In any other technical business, deployments on a Friday afternoon would be
considered a cardinal sin if done manually; however, given that <ACME>’s
clients are mainly government agencies, this deployment timeframe is fine as
long as the staff is available to revert a deployment. Having said that, this
needs to change ASAP, as it is not scalable and is just waiting for a perfect
storm to cause havoc.

The configuration team’s efficiency should be given top priority, with
investments made regularly to improve its KPIs every quarter, because a
dollar saved in configuration is a dollar made. Currently, this is not the
case.


Access to skilled developers is another issue that came up in the audit. The
first question that this auditor had was why there was a need for developers
in the first place. Configurations should be easy enough, and custom modules
should be very rarely used, given how little value they add back to the product
and the complexity of maintaining a separate codebase. The very nature of a
custom module is that it is not developed within a strict system or as part of
it, which allows developers to do whatever they want. And when push comes
to shove with a decision between quality and speed, well, speed always wins if
you allow people to take shortcuts.

There has been talk about creating a pool of developers that could be shared
between the configuration and core product teams. From an operational
standpoint, this plan is valid, but it does not take into account the developers'
will and desire. This is not an ideal career move for any sufficiently skilled
developer, which will cause recruiting and major retention problems.

OKR

Objectives

1. Configurations should be deployable anytime during the day; this
indicates a healthy development process, even if the Friday afternoon
deployment window is maintained.
2. Configurations and custom modules should be deployable without IT
3. Configurations and custom modules deployment should be reversible,
non-destructive and within a single click.
4. No developer should be allowed to affect a core table or create a
custom table.
5. QA must be systemized so no human can ever circumvent it.
6. Continuously reduce complexity so that the configuration team needs
less-skilled developers, which will help with recruiting because such
profiles are more likely to take the job than seniors, who will see it
as a demotion of sorts.

Key Results

1. Deployments can be done in a single click within 5 minutes.
2. QA involves a full regression of the core product and a client’s
configurations/custom modules automatically before deployment.
3. 100% of all development is achieved within the core product without
manual deployments.

Actions

1. Implement and track DORA metrics as a continuous improvement goal.
2. Eliminate any and all manual deployments, which should all be done
directly from the product.
3. Implement a plugin system for all configurations, custom modules, or
custom business logic injections.
4. Proactively improve efficiency as a business goal.
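
To illustrate what a plugin system for custom business logic might look like, here is a minimal sketch in Python. It is an assumption-laden illustration, not a proposal for <ACME>'s architecture: the PluginRegistry class, the hook name before_save, and the record shape are all hypothetical. The core product declares the hook points it supports, custom logic registers against them, and each handler receives a copy of the record so it cannot modify core data directly.

```python
from copy import deepcopy
from typing import Callable

Handler = Callable[[dict], dict]


class PluginRegistry:
    """Hypothetical registry: the core decides which hooks exist, plugins fill them."""

    def __init__(self, hooks: set[str]) -> None:
        self._handlers: dict[str, list[Handler]] = {name: [] for name in hooks}

    def register(self, hook: str, handler: Handler) -> None:
        """Attach custom logic to a hook the core product has declared."""
        if hook not in self._handlers:
            raise KeyError(f"unknown hook: {hook}")
        self._handlers[hook].append(handler)

    def run(self, hook: str, record: dict) -> dict:
        """Pass a copy of the record through each handler in registration order."""
        result = deepcopy(record)
        for handler in self._handlers[hook]:
            # Each handler works on its own copy, so the caller's data is never mutated.
            result = handler(deepcopy(result))
        return result
```

The design choice that matters here is that custom code can only run where the core product invites it, which is what makes it testable in a regression suite and keeps it away from core tables.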

6. Core Product
The core product faces a wide range of problems, from product management
to architecture, deployment model, technical direction, and team
composition. Firstly, the core product team only issues an update quarterly
and receives signals from sales/marketing and executives. It is clear that there
is no product management practice whatsoever directing product
development.


While it makes sense that product management is limited, given the
competitive landscape in which <ACME> operates and that its clients are
not as demanding as those of a garden-variety B2B SaaS, it should not be
completely absent. Product management is not simply the process of
determining features but of realizing business objectives through product
features, whether the objective is to reduce cost, improve profitability,
increase adoption, reduce complexity, or otherwise.

The deployment model of the core team is equally problematic; while it
seems like it is working fine currently, it requires an outsized effort to push a
large update, which seems to be the reason why updates are only made on a
quarterly basis. V4, as the current version of the core product is colloquially
known, does not have any of the automated tests required to improve the
quality, velocity, and cost of developing new features. This means that a
developer could break something and only know about it once it hopefully
surfaces during QA or, worse, is discovered by a client. Automated testing
goes hand-in-hand with reducing the development cost and mitigating the
impact of key employee departures.

The deployment model requires a cumbersome manual operation, which
involves copy-pasting core product files alongside the custom module files
that may already be present on the server, manually migrating the database
schema, and ensuring everything works well post-deployment. As mentioned
before, this is an antiquated, error-prone approach requiring extensive
coordination across multiple people and departments. Modern web
application deployment is invisible, nearly impossible to break and requires
zero human intervention.


What concerns me is the redevelopment of the future V5 version of the core
product. I have not heard from anyone that this is a priority except for the core
product team. While their choice of technologies is fine, they need more
expertise in those technologies internally. There was no mention of any real
and measurable business value that this projected 2-year project will realize.
This auditor heard nothing from executives about the V5, which calls into
question executive support and willingness to undertake such an extensive
project. Furthermore, there is no principal engineer or architect present for
the design of the V5. While the team is motivated and excited about this
project, it is this auditor’s professional opinion that the project is starting
off on the wrong foot and has a very high chance of being cancelled
prematurely.

OKR

Objectives

1. Implement self-deployment capability and over-the-air updates to the
core product.
a. This will also serve the configuration team.
2. Upgrade the codebase and technology incrementally rather than through
a full rewrite.
3. Minimum of 80% test coverage
4. Deploy more frequently once it becomes easy to do so
5. Product management should be able to align development tasks with
business value every time.
6. Implement a product management practice that evolves the core
product on a continuous improvement basis.

Key Results

1. Deployments can be done in 5 minutes or less, multiple times per day.
a. “Can” and “will” are different; we want the capability to do so
because it will evolve the product to where we need it to be.
2. V4 to V5 upgrade is done incrementally with updates bi-weekly at the
very least.
3. Product management for the core product is all about continuous
improvement in a proactive way rather than its current reactive posture.


Final statement
The problems <ACME> Software is experiencing are related to scale
exacerbated by an inefficient process. What worked before no longer works
today and will only get worse unless there is a serious investment made in
every aspect of technical operations.

The primary goal of the evolution of technical operations is to make all
technical tasks so easy that when people work hard, they deliver far more
than they would by working hard on tasks that are hard to do. The very fact
that this is an issue at <ACME> tells this auditor that the company has the
right team in place and that, with some adjustments, we could turn it around
in 8-12 months. At that point, <ACME> will look very different than it looks
today.

<ACME> has been running without a de facto CTO for long enough that it
may not need one in the traditional sense. The teams are operating
independently; inefficiently, yes, but they are operating, which is more than
this auditor can say about other companies at the same stage of growth.

Hiring a Fractional-CTO to oversee the transformation at the company is a
prudent, cost-effective approach. The action plan for achieving this will be
determined during an engagement, as it is important to include the entire
company and work with its people rather than impose a plan on them. It is
this auditor’s experience that working collaboratively ensures the best
results.

Oshri Cohen
