Software Engineering
Management
Questions for architecture
A cheat-sheet to help drive effective outcomes and facilitate inter-departmental communication.
Business Questions
Please describe the high-level vision of the business proposal, the problems it solves and goals it supports.
- Do they understand the current system thoroughly, including its strengths, weaknesses, and dependencies?
- Do they understand the business's needs and goals?
- Can they provide evidence of stakeholder engagement and a clear understanding of their requirements?
What business benefits do you anticipate from your proposed architecture?
- How does the proposed architecture link to these benefits?
How do you propose to measure those benefits?
- What are the cost implications of this architecture?
Infrastructure requirements, licensing costs, etc …
- How will you measure end-user satisfaction?
User feedback, adoption rates and other engagement metrics.
What are the risks of this architecture, and what are the strategies for mitigating them?
What changes should we make, at an organisational level, to support this project?
What are the priorities for delivery? Can this list form the basis of the Epics in an Agile environment?
How is this architecture future-proofed? How easy is it to adapt the architecture to changing business needs?
Application Questions
Can you describe the data and application architectures in your proposal?
Variants
- Is there more than one candidate solution architecture?
- What are the pros and cons?
- How does the cost and time analysis factor into these choices?
- What candidate architectures were ruled out and why?
Look for:
- Modular Design
Can each module be updated, replaced and scaled individually without affecting the rest of the system?
(resilient and cost effective)
Do we provide wrappers over off-the-shelf products to make them replaceable if ever needed?
(abstraction over underlying functionality)
Is the application broken into small, loosely coupled services which can be scaled individually?
(micro-services)
- Auto-Scaling
Automatically adjust the consumed resources (and cost) based on current demand without manual intervention.
(elastic scaling)
- Reliability
How many 9s are we targeting?
(this is a measure of uptime, 99.9% = three 9s, 99.999% = five 9s)
Is disaster recovery considered?
Can the system distribute traffic efficiently across multiple servers or instances?
(load balancing)
- Monitoring and Logging
This is essential for debugging in the cloud.
- Identity and Access Management
How do you ensure that only authorised users can access relevant resources?
Do we include multi-factor authentication?
Do we include single sign-on?
(IAM, MFA, SSO)
- Data
Data should be encrypted both in transit and at rest.
How will we ensure the architecture handles data consistency across distributed systems?
(TLS, ACID vs eventually consistent)
- Defence
Is there a cloud-based firewall to monitor and control traffic to the system?
(WAF, web application firewalls)
- API Access (and security)
Is there an application gateway? Should there be? These provide authentication, caching, rate limiting, versioning, analytics, and monitoring of API access to prevent abuse.
(APIM)
- Other Systems
What other technologies / products are you proposing to use and why? What is the strategy for integration?
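To make the "how many 9s" question under Reliability concrete, the sketch below converts an availability target into a yearly downtime budget. The function name `downtime_per_year` and the choice of a 365-day year are illustrative assumptions rather than any standard.

```python
from datetime import timedelta


def downtime_per_year(availability_percent: float) -> timedelta:
    """Return the downtime allowed in a 365-day year for an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    unavailable_fraction = 1 - availability_percent / 100
    return timedelta(days=365) * unavailable_fraction


# Print the budget for a few common targets.
for target in (99.0, 99.9, 99.99, 99.999):
    print(f"{target}% allows about {downtime_per_year(target)} of downtime per year")
```

Three 9s leaves roughly 8.8 hours of downtime a year, while five 9s leaves a little over 5 minutes, which is why each extra 9 tends to have a significant cost implication.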
Delivery Practices
What is your plan for migrating from the current architecture to the proposed one?
- Is the proposition minimally disruptive? How so?
- Is the migration plan incremental in delivery?
(strangler pattern, piecemeal replacement, phased by location or user sub-set)
How engaged is the implementation team?
- Is the architecture using known technologies?
If possible, choose technologies, languages, and frameworks the team is already familiar with to reduce the learning curve and to help them implement the architecture effectively.
Will we provide training if new technologies are necessary?
- Is the architecture sympathetic to delivery capacity?
If the delivery team is small, or lacks resources, the migration plan should focus on small, discrete pieces of functionality that can be delivered according to the team's availability and that are designed to be easy to manage and scale without a lot of manual intervention.
- Is this collaborative?
Is there a feedback loop to allow the implementation team to voice concerns, suggest improvements, and be part of the decision-making process?
How do you plan to govern the implementation of the architecture?
- Who will be involved?
- What are the decision-making processes?
What is the plan for user acceptance testing?
- How will changes to the architecture be managed once it is in place?
- Do we need to provide end-user training?
Questions for Engineering Leadership
A cheat-sheet to help drive effective outcomes and facilitate inter-departmental communication.
Technology
- What's our current tech stack and why was it chosen?
- How do we stay updated with the latest technology trends?
- What are the biggest tech-debts we have, and what's the plan to address them?
- How are we addressing cybersecurity and data privacy?
- What is our disaster recovery and business continuity plan?
- How do we measure the performance and reliability of our technology?
- What’s our strategy for scalability and are we poised for future growth?
Current Projects
- What are the key projects currently underway and what business objectives do they serve?
- How do you prioritise among projects?
- Who are the project stakeholders and how is their feedback incorporated?
- What challenges are you currently facing with these projects and how are you addressing them?
- How do we ensure projects are completed on time and within budget?
- What metrics are you using to evaluate project success and ROI?
Assessing Performance
- Are tech projects being completed on time and achieving their intended results?
- Is there a positive, innovative, and productive team culture?
- Are we meeting our resilience, performance and security targets?
- Is tech spending within budget and providing value?
Architecture
Technical Architect Responsibilities
A Misconception
If you want to build a house, an architect drafts the blueprint while a craftsman lays the bricks following that plan.
Similarly, in software, a developer may pen the code but it's not natively understood by computers. A compiler (or interpreter) translates this code into machine language for execution. This step is aptly named "build."
The developer is the true technical architect, crafting the blueprint that the compiler then brings to life.
So if the developer is the architect, what then is a technical architect’s job?
The Elucidator
Architects engage with users, developers, and business stakeholders to gather insights and perspectives that might not be immediately apparent. Engaging stakeholders is key to understanding the strategic context that might impact the project.
As business processes are mapped out, architects might discover novel ways of re-engineering them for better outcomes.
Architects rely on current state analyses to translate high-level business goals into tangible, technical requirements and actionable tasks.
The Translator
The translation from a product specification to a technical specification requires a unique blend of skills.
The architect serves as a crucial bridge that can perform this duty, ensuring that technical solutions are aligned with business objectives and that the product team fully understand any technical constraints and the choices that might affect timelines or deliverables.
Softer skills like clear communication, negotiation, and stakeholder management are essential here.
Architects cooperate with multiple departments and play a part in planning and estimation. Understanding both the business and technical domains allows them to serve both sides. They will also play a crucial role in assessing the impact and feasibility of new feature requests.
An architect who can code can quickly create prototypes to validate their assumptions, ensuring that the business architecture aligns with what is technically feasible.
Architects need more technical skills AND more soft skills than senior developers
The Facilitator
Rather than dictate, the architect should guide and mentor. Their role is to help the team to make technical decisions that align with the project’s goals, scope, and delivery schedule.
They should communicate the technical vision to the team and help the team understand how their work contributes to that vision. They should provide detailed documentation and diagrams that help both software and infrastructure engineers understand what needs to be built and why.
Technical architects are charged with evaluating the business requirements, finding technical solutions and actively leading the technical vision to success.
The Developer
A well-thought-out architecture accommodates future growth, making it easy to scale the application. It optimises resource usage, lowers costs, keeps data secure, makes the system more resilient to failures and can serve as a roadmap for the entire team.
Done poorly, architecture leads to technical debt, code that cannot handle peak loads effectively, exposure to data breaches, poor development velocity, increased operational costs and customer dissatisfaction.
Given the severity of these extremes, it is important that an architect know where their design is accelerating the team and where it is under-performing.
How?
- Listen to other developers. Struggles with codebase complexity or frequent workarounds often indicate architectural issues.
- Perform frequent code reviews. The code will reveal architecture antipatterns or inconsistencies.
- Monitor system performance. Performance metrics and logs can provide an early indication of inefficiencies that may be due to poor architecture.
“A software architect who is not in touch with the evolving source code of the product is out of touch with reality”
Craig Larman & Bas Vodde - creators of Large-Scale Scrum
An Ivory Tower
The term "Ivory Tower Architect" is used to describe a technical architect who operates at a high level, often disconnected from the realities of the codebase.
Architects without developer experience might not have first-hand knowledge of the challenges related to implementation, scalability, and deployment.
Architects who are more theoretical can sometimes propose solutions that seem more complex than necessary, without clearly defined benefits.
Being distant from the daily nuances of implementation could lead to a disconnect with development teams, potentially resulting in miscommunication and conflicts.
Without direct involvement, these architects might find it challenging to identify where potential issues originate, whether in execution or design.
A Pathway Forward
- Architects should always write some code, if only occasional prototyping.
- Architecture should never "hand off" to engineering. Use architects as pair programming mentors, design workshop coaches, etc ...
- A software architect is expected to be an active member of the Scrum team. This means participating in all the usual ceremonies.
Examples from Industry
Aaron Boodman, Google engineer.
Does Google employ software architects? Not really. Google doesn't consider that a separate skill. As you advance in your engineering career at Google, you may find yourself leading projects and this naturally involves more design and less coding. But nobody’s sole job is to “architect software”. Every engineer shares that responsibility.
Bob See, Principal Recruiter, Google Engineering
"Architect" positions/titles don't actually exist at Google. The engineering culture at Google is actually a bit allergic to the idea of people making technical architecture decisions and then expecting other people to do the coding/implementation.
From Apple's current job pages
In this role, you will be a member of the Platform Architecture team, working with hardware and software engineering groups to shape the architecture of Apple's future System-on-Chips (SoC). ... We are looking for SoC architects with a passion to innovative new hardware concepts and model them in C++/Python to demonstrate their value and impact.
From Microsoft's current job pages "Azure Solutions Architect"
Technical experience and knowledge of architecture, solution development and deployment techniques ...Ability to lead a multidisciplinary technical team, provide clear direction and take accountability for technical project outcomes
Gergely Orosz, the "pragmatic engineer"
neither Uber nor Skype/Microsoft have hands-off software architects
Diagramming - How to
We aim for effective communication. The following questions and pointers are intended to provoke reflection and to help ensure that a diagram achieves its intended outcome:
- A target audience.
Who is the diagram for? Executives, developers, or just other architects?
- A defined purpose.
Should this provide an overview or is it meant to show how components interact? (for example)
- A cohesive narrative.
Even technical diagrams and documents should tell a story. How do the components of the diagram interact to fulfil the overall system's purpose?
- A broad perspective.
The goal is to convey fundamental structure and behaviour, not every detail.
- A consistent visual.
Group related technologies together or use the same icons and colours to show similar elements.
If one line represents data flow it is likely that every line should represent data flow, etc …
SDLC
Diligence
User Interface Testing
We transform our user stories in Jira into a test plan. The test plan is then scripted to provide an automated, system-level testing of the user interface and overall customer experience. This can be run headlessly in an automated pipeline.
As with our code-level tests, system-level tests are written to prove the existence of bugs before they are fixed. In this way we build a library of tests that grows over time and provides ever more reassurance.
User Acceptance Testing
We test the system in real-world scenarios with real-world users, to validate that it meets their requirements and expectations.
Non-functional Testing
Some or all of the below may be necessary:
- Stress Testing
Assess the system's performance and stability under extreme workloads.
- Penetration Testing
Simulate a cyber-attack against the system to identify vulnerabilities.
- Failover Testing
Deliberately cause components of a system to fail, ensuring that backup components automatically take over (with minimal disruption to usage).
- Chaos Engineering
Introduce random failures into the system to assess its resilience and fault tolerance.
Bug reporting
Bug reports should include the following attributes to assist in understanding and speedy resolution:
- Title.
- Description
Including what the testing objective was, what the tester expected to happen and what happened.
- Steps to Reproduce
A step-by-step guide to reproduce the issue.
- Environment
Where the bug was encountered.
- Severity and Priority
- Any relevant attachments
Screenshots, error messages, logs, etc
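The attributes above could be captured as a simple record so that incomplete reports are caught before triage. This is only a sketch: the class name `BugReport`, the `Severity` levels and the numeric priority scale are assumptions made for illustration, not an existing tracker schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    # Hypothetical severity levels; use whatever scale the team has agreed.
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


@dataclass
class BugReport:
    title: str
    objective: str                 # what the tester was trying to achieve
    expected: str                  # what the tester expected to happen
    actual: str                    # what actually happened
    steps_to_reproduce: list[str]
    environment: str               # where the bug was encountered
    severity: Severity
    priority: int                  # assumed scale: 1 means fix first
    attachments: list[str] = field(default_factory=list)

    def missing_fields(self) -> list[str]:
        """Return the names of required fields that were left empty."""
        required = {
            "title": self.title,
            "objective": self.objective,
            "expected": self.expected,
            "actual": self.actual,
            "steps_to_reproduce": self.steps_to_reproduce,
            "environment": self.environment,
        }
        return [name for name, value in required.items() if not value]
```

A report with blank fields can then be returned to its author with a list of what is missing, rather than being triaged half-complete.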
Develop
SOLID Principles:
- Single-responsibility principle:
We write methods that have discrete purposes and avoid side-effects.
- Open-closed principle:
We recognise that both composition and inheritance are just tools in our arsenal and use each where appropriate.
- Liskov substitution principle:
Object polymorphism is inherent in our choices of programming language.
- Interface segregation principle:
We prefer many topic-specific interfaces over larger more general ones.
We avoid implementing interfaces we cannot completely fulfil.
- Dependency inversion principle:
We integrate components with each other via interfaces where there is the possibility of that implementation ever changing, including potential changes of implementation to support testing.
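A minimal sketch of the dependency inversion point above, assuming a hypothetical notification feature. All names here (`MessageSender`, `SmtpSender`, `RecordingSender`, `WelcomeService`) are invented for the example; the point is only that the service depends on a small interface, so the implementation can be swapped, including for a test double.

```python
from typing import Protocol


class MessageSender(Protocol):
    """The small, topic-specific interface our code depends on."""

    def send(self, recipient: str, body: str) -> None: ...


class SmtpSender:
    """Stand-in for a wrapper around a real mail product; here it only prints."""

    def send(self, recipient: str, body: str) -> None:
        print(f"sending to {recipient}: {body}")


class RecordingSender:
    """Test double that records messages instead of sending them."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def send(self, recipient: str, body: str) -> None:
        self.sent.append((recipient, body))


class WelcomeService:
    """Depends on the interface, never on a concrete sender."""

    def __init__(self, sender: MessageSender) -> None:
        self._sender = sender

    def welcome(self, email: str) -> None:
        self._sender.send(email, "Welcome aboard!")


# Production wiring uses the real wrapper; tests pass in RecordingSender instead.
WelcomeService(SmtpSender()).welcome("new.user@example.com")
```

Because `WelcomeService` never names a concrete class, replacing the underlying mail product means writing one new wrapper rather than touching every caller.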
YAGNI, ‘You ain’t gonna need it’:
Good developers will often:
- Look for the generic solutions to the specific problem they are solving.
- Plan for future variance and produce highly configurable code.
There are many excellent development practices that contribute to clean and DRY code (see below) but that can also be time-consuming and may not ultimately be needed. We aim to balance these against the YAGNI principle.
DRY, “Don’t repeat yourself”:
We look for opportunities to reduce the amount of code we write by encapsulating functionality into discrete, reusable functions. We balance this against the YAGNI principle.
Documenting / Commenting
Informative comments provide help to the next developer to work on an area of code, even when that developer is the original author at some point in the future. We write comments because we recognise that reading code is harder than writing it.
Complex areas of code are especially important to comment, often noting the paths not taken. Public APIs are specified using OpenAPI (Swagger).
TDD
Test-driven development front-loads the overhead of testing and lowers the overall cost of development. We identify errors early, at a time when the code has been freshly written.
Unit tests provide evidence that a single unit of code works, typically in isolation. Integration tests, sometimes called “sociable unit tests”, provide evidence that a whole sequence of code works together.
When bugs are found we write new automated tests that prove the bug exists before we fix it. Although our initial testing is included in the cost of the story or task in Jira, bug reports that have been promoted beyond a developer environment are triaged via a new ticket in Jira.
Our pipeline collects test-coverage metrics when it executes our tests so that we can monitor and improve on this.
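As an illustration of writing a test that proves a bug exists before fixing it, the sketch below uses Python's built-in `unittest`. The function `apply_discount` and the bug itself (a discount above 100% producing a negative price) are invented for this example.

```python
import unittest


def apply_discount(price: float, percent: float) -> float:
    """Return the price after a percentage discount, never below zero."""
    # The clamp on the next line is the fix. Before it was added, a
    # discount above 100% produced a negative price.
    percent = max(0.0, min(percent, 100.0))
    return round(price * (1 - percent / 100), 2)


class ApplyDiscountRegressionTest(unittest.TestCase):
    def test_discount_over_100_percent_does_not_go_negative(self):
        # Written first to reproduce the reported bug; it failed until
        # the clamp above was introduced.
        self.assertEqual(apply_discount(50.0, 120.0), 0.0)

    def test_ordinary_discount_still_works(self):
        # Guards the existing behaviour while the fix is made.
        self.assertEqual(apply_discount(50.0, 10.0), 45.0)
```

Saved as a test module, this can be run with `python -m unittest`. The regression test stays in the suite afterwards, so the same bug cannot quietly return.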
GIT Source Control
Semantic Versioning:
We use git commit tagging to designate versions according to semver principles.
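A pipeline needs some way to decide whether a tag is a version. The sketch below is one possible check, assuming tags of the form `MAJOR.MINOR.PATCH` with an optional `v` prefix; the prefix, the helper names and the decision to ignore pre-release and build metadata are all assumptions made to keep the example short.

```python
import re
from typing import NamedTuple, Optional

# MAJOR.MINOR.PATCH with no leading zeros, optionally prefixed with "v".
SEMVER_TAG = re.compile(r"v?(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_tag(tag: str) -> Optional[Version]:
    """Return the version a tag represents, or None if it is not a version tag."""
    match = SEMVER_TAG.fullmatch(tag)
    if match is None:
        return None
    return Version(*(int(part) for part in match.groups()))


def latest(tags: list[str]) -> Optional[Version]:
    """Return the highest version among the tags, ignoring non-version tags."""
    versions = [v for v in map(parse_tag, tags) if v is not None]
    return max(versions, default=None)
```

Comparing parsed tuples rather than raw strings matters: as text, "1.9.3" sorts after "1.10.0", but as versions 1.10.0 is the newer release.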
Feature Branching:
Developers create feature branches named after their ticket.
Avoiding ‘fox-trot’ Merges:
We rebase our feature branches on top of the master branch and then merge the feature branch back into the master. This ensures that origin/master is always the first parent in every merge.
Pull Requests and Peer Review:
GitHub supports a peer review process that our developers use to validate each other’s code before we merge it into the master branch. This process helps raise the quality of the code written and can prevent errors and misunderstandings from entering the main work stream.
Security
Common Principles
- We adhere to the principle of least privilege and deny by default.
- We prefer GUIDs as identifiers to guard against URL manipulation via insecure direct object references.
- We exercise judicious CORS configuration
- We issue short-lived JWTs
- For data travelling outside of our trusted environment, we always use TLS (the successor to SSL).
- We rely on well recognised encryption algorithms, avoiding older techniques, and never try to write our own.
- We prefer not to store sensitive information where possible, including in a cache.
- We recognise the dangers of user-supplied content where it can lead to injection hacking. We sanitise input appropriately, including the use of parameters for database queries.
- Our pipeline includes a static analyser that scans for vulnerabilities, both in our code and in the third-party packages we use.
- We review the third-party packages we include with our software to make sure only those which are necessary are included.
- Included packages are pinned to specific versions that we have tested against, rather than being bound to the latest major/minor versions.
- We employ a third party to perform penetration testing. We redo this with every major release or at least once a year.
- We implement strict user password requirements on length and complexity and prevent common passwords from being used.
- Users are locked out, delayed from attempting to log in, or presented with a CAPTCHA once a given number of failed login attempts have been made. This guards against brute-force attacks and credential stuffing.
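To illustrate the point above about using parameters for database queries, here is a small sketch using Python's built-in `sqlite3` module. The `users` table and the `find_user` helper are invented for the example.

```python
import sqlite3

# An in-memory database with a hypothetical users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])


def find_user(connection: sqlite3.Connection, name: str) -> list:
    """Look up users by name using a bound parameter, never string formatting."""
    # The ? placeholder passes the value separately from the SQL text,
    # so the input is always treated as data and never as SQL.
    query = "SELECT id, name FROM users WHERE name = ?"
    return connection.execute(query, (name,)).fetchall()


malicious = "alice' OR '1'='1"
print(find_user(conn, "alice"))    # [(1, 'alice')]
print(find_user(conn, malicious))  # []
```

Had the query been built by concatenating the input into the SQL string, the second call would have matched every row. With a bound parameter the whole input is compared as a literal name and matches nothing.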
Deploy
CI/CD
Branch Protection
The master branch of the project is protected from direct changes. Changes can only be merged into the master branch, and then only when certain conditions are met.
Feature Branch Triggers
Code can be pushed to a feature branch without consequence. Raising a pull request should do the following:
The PR name is linted against a Jira ticket.
The code is compiled.
The code is subject to static analysis for security purposes.
The unit and integration tests are run. We only run in-process tests here. Some integration tests may be run later.
Code coverage is collected
A new artifact is created for this build.
*A new resource group is created in the cloud and the artifact + any requisite data is deployed there using IaC.
*The system tests are run against the newly created environment. Integration tests that require an environment will be run here.
*Code coverage is collected
* Branch-level deployment is an advanced strategy and a good target to aim for.
Pushing to a feature branch with an outstanding Pull Request re-runs the above checks. If branch-level deployment is present the testing environment can be safely torn down.
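The first check above, linting the PR name against a Jira ticket, could start with something as small as the sketch below. It assumes the convention that a PR title begins with an issue key such as `SHOP-142`; the convention, the project key and the helper name are assumptions, and a fuller check would also confirm with Jira that the ticket exists.

```python
import re
from typing import Optional

# An upper-case project key, a hyphen and an issue number, e.g. SHOP-142.
JIRA_KEY = re.compile(r"([A-Z][A-Z0-9]+-[1-9]\d*)\b")


def jira_key_from_title(title: str) -> Optional[str]:
    """Return the issue key at the start of a PR title, or None if absent."""
    match = JIRA_KEY.match(title.strip())
    return match.group(1) if match else None


for title in ("SHOP-142 Add basket totals", "fix typo"):
    key = jira_key_from_title(title)
    print(f"{title!r}: {'ok, ' + key if key else 'rejected, no ticket key'}")
```

Failing the pipeline when the function returns `None` keeps every merged change traceable to a ticket.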
Master Branch Triggers
The code can be merged to the master branch with at least one approval in the PR process. Merging into master triggers the same actions as above but with deployment to UAT.
Release Management
Creating a new tag for a commit on master that matches the semver pattern triggers several sequential actions:
The Staging environment for this project is updated with the infrastructure code from the tagged commit.
The artifact that was created for the tagged commit is deployed to Staging. This process can be triggered manually by specifying the commit to deploy, which facilitates a rollback if we deploy malfunctioning code.
Infrastructure as Code
Our infrastructure is written in declarative YAML. It is stored in source control and executed as part of our pipeline.
Monitoring
Logging in the Cloud
In a cloud environment it can be very challenging to debug software. We should implement detailed logging across the system, both to monitor healthy activity and to shine a light on errors. We will utilise the cloud provider's native logging services for this.
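As a sketch of what detailed logging might look like in application code, the example below uses Python's standard `logging` module to emit one JSON object per log line. It assumes the platform collects the process's output stream; the field names and the `orders` logger are illustrative choices, not a requirement of any provider.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Format each record as a single line of JSON."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Include the stack trace so errors can be diagnosed remotely.
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def build_logger(name: str) -> logging.Logger:
    """Return a logger that writes JSON lines to the standard error stream."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    return logger


log = build_logger("orders")
log.info("order accepted, id=%s", 1234)  # healthy activity
try:
    1 / 0
except ZeroDivisionError:
    log.exception("order pricing failed")  # error, with stack trace
```

Structured lines like these are easier to filter and query than free text once they have been collected centrally.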
Health Checks
Health checks ensure the system is up and responding, and may be used to verify specific system functionalities.
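One way to structure this is to run a set of named checks and report an overall status, which a health endpoint can then return. The sketch below is an assumption about how that might look: `run_health_checks` and the example check names are hypothetical.

```python
from typing import Callable, Dict


def run_health_checks(checks: Dict[str, Callable[[], bool]]) -> dict:
    """Run each named check and summarise the overall health of the system."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "failed"
        except Exception as error:
            # A check that raises is reported as unhealthy, not allowed to crash.
            results[name] = f"error: {error}"
    healthy = all(state == "ok" for state in results.values())
    return {"status": "healthy" if healthy else "unhealthy", "checks": results}


def database_reachable() -> bool:
    # Placeholder: a real check would open a connection and run a trivial query.
    return True


print(run_health_checks({"api": lambda: True, "database": database_reachable}))
```

Reporting each check by name means an unhealthy result already says which functionality is affected.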
Alerting
If an error occurs, or a health check fails, an alert should be triggered to notify the relevant team members as soon as possible.
Design
Baseline Technical Specifications
Where an existing solution is already in place, analyse its implementation, assess the strengths and weaknesses of the implementation, and decide if any parts can be reused. Do this from a technical and user-oriented perspective.
Identify the architecture patterns, technologies used, data flows, security measures and so on. Identify third-party interactions and their mechanisms.
Target Technical Specifications:
Provide technical solutions that fulfil the agreed requirements. Identify software languages, service interactions, database choices, cloud providers, third party suppliers and other strategic choices.
APIs may also be defined here with the understanding that these are prone to change as the project progresses.
Security is planned from the outset, including where data will be stored and encrypted, defensive network topologies, and both authorisation and authentication.
Robustness is planned from the outset, with redundancy in storage and infrastructure across multiple zones or regions. Traffic management, load balancing and DNS records all contribute to this.
Scalability is planned from the outset, either with serverless infrastructure, VM scale-sets or containers and container orchestration. Database partitioning should be considered.
Where there is more than one viable solution, a list of the pros and cons should be collated.
Engage the development team early in this process. Their input will ensure the design is technically feasible and utilises their skills effectively. Defining the technical architecture should be a collaborative process, with developers having the opportunity to propose their own solutions and clarify their understanding. Technical architecture is a guide for development, not a rigid plan. Allow for flexibility of implementation but ensure the overall structure and principles are adhered to.
Gather Team and Stakeholder Feedback
Technical assistance is offered to the wider team and key stakeholders to guide them through the implications of the choices they make at this level. The pros and cons of the viable solutions must be discussed and a strategy chosen that is aligned to achieving both technical feasibility and financial sustainability.