Tuesday, December 10, 2019

The W5 of developing software products


This post is from a set of slides I did in a presentation to a group of product managers. The title of the talk was "Working effectively with developers", and the general theme was to explain development terms such as "technical debt" and "refactoring".
Once we started talking about what these terms meant and how they fit into a development process, a larger picture of what software product development actually 'is' started to emerge. So, to break this very large subject down into a manageable presentation, I used the tactic of investigative journalism and focused on the 'who', 'why', 'what', 'where', 'when' and 'how' of developing products.
This is from a developer's point of view, but drawn from the experience of being both a developer and a product manager. When starting a software company from scratch, you really need to perform the duties of both roles at the same time, and you realize that their goals are the same:

Observation: At the end of the day, the main goal is to solve a user's problem with your technology, and they will pay you for it. With that money you eat, party, and continue the cycle again.

What is product management?

When talking to many of my product manager and development colleagues, one theme emerged pretty quickly: the definition of what "it is" was different in every conversation, and the roles and responsibilities outlined in job descriptions titled "Product Manager" diverged just as much.

This role, which didn't really exist 20 years ago, is still finding its own definition. That definition is different in every group (company/team) producing software. Now, I see it as a catch-all to manage:
  • a Product (what?)
  • a Customer (who and why?)
  • a Process to make it (how)
  • the Schedule to deliver a solution (when)
  • a Team to make the solution (who again? yikes)
This sounds like everything! So, to do everything, it might be a good idea to separate the roles and responsibilities to understand what gets done, by whom, and when. Be sure to have an in-depth conversation to outline these responsibilities (and expectations) when taking on a Product Manager job.

The Product Developer

From the developer's point of view, there has been an equal amount of change in what a 'developer' is. In the most fundamental view, a technology developer has to perform three basic functions: the design of a system, the implementation of that system, and the validation of the system's function and utility.
In isolation, with a well-defined design, this is possible in many technologies. Buildings are designed with blueprints, built with construction methods, and validated with engineering. That works in civil technology (buildings, cars, trains), but for software engineering the methods are different.

Observation: Skills vary tremendously when talking about one person doing all of this, and software lacks formal education and certification on what it means to be an 'architect', 'engineer' or builder (programmer). None of this has been standardized, so a non-technical person hiring a 'developer' has a very murky understanding of what that person is capable of.

So, let's investigate...
The order of priority of the 'who', 'why', 'what', 'where', 'when' and 'how' matters greatly, but it all has to get done. Things differ with the type of company. Is it product driven or project driven? If you are selling hours as a consultant, the accounting of hours and the estimation of effort take on greater importance. If it is a product, then the customer relationship and domain expertise take on greater importance than the schedule. It really depends on what you are getting paid for.
In the view of the product manager, I believe the order of priority should be "Who", then "Why", and the rest in random order. If those first two are taken care of, many parts of the what, how and when fall into place. This is probably an over-simplification, but I really want to stress the importance of the Who and Why in the realm of the product manager's priorities.

Why we build products

This is fundamentally the main purpose of product management: understanding why a customer needs what you are building. They need to solve their problem, and in using your product they want to solve it in a more effective way than they are currently doing it.

How to do this effectively? Understand the customer and their domain: what their job is, what the actual outputs are, and why some outputs are more important than others. This domain modelling is extremely important in a product-driven company.

Stakeholders are everywhere! The customer, their boss, your boss, your teammates and their boss. Everyone has an interest in all of this going well. The priority of who gets attention can get lost in a busy day, but in the end, the customer should win all debates.

Observation: In a software company, you will work with smart people, and those people will have opinions (hopefully). This is a good problem to have, but it can lead to 'analysis paralysis' in figuring out what to do next. Use the customer's point of view as a unifying force. Create your personas, outline their problems, name them, and use them to shine light on what's important. We all have opinions, but the customer's opinion actually pays the bills.

What we are building: a technology

You can build buildings, cars or software, but the same fundamentals of Utility, Structure and Beauty apply to successful pieces of technology. It has to work, it has to solve a problem, and even better it has to feel good doing it.
  • Functionality – solves problems, or doesn’t
  • Structure – works great, or falls apart
  • Usability – feels great to use, or not
Observation: Functionality and a little usability can work even if the structure isn't great. People love Ferraris even though they break down, and people loved Twitter even when they kept getting the 'fail whale' screen. Despite the structural issues, the functionality and usability were there.



For architecture nerds out there, this is the fundamental pattern of architecture known as the Vitruvian Triad. It holds up as a valuable pattern even if it was defined 2000 years ago. 

For software, we run into a bit of an issue here. Software isn't physical, it needs hardware, and the user only sees about 10-15% of the technology. I call this the 'iceberg effect', since the user only interacts with the UI and isn't (and shouldn't be) exposed to the inner workings of what makes the diagram change when a button is clicked.

‘Tech Debt’

So, the term 'tech debt' seems to strike fear into the business crowd, and for this developer it's hard to understand why. Businesses (especially start-ups) don't seem afraid of debt at all, and sometimes leverage the entire perceived value of the company for funding.
What is the current state of your technical balance sheet? You have assets if users are using features in your system, and you may have taken on debt to create them.

How does tech debt happen? Feature complexity is fun and easy to create when you ignore the costs, the same way it's fun to make dinner and not do the dishes. To manage it:
  1. Keep a source of truth on assets (tested requirements), and understand the debt incurred to make them.
  2. Pay off a little debt each release. Improve the usability, performance, security, observability, portability and maintainability of your system.
Observation: Debt can be very useful, but it needs to be managed; or the complexity of managing debt will overwhelm you.

How to build products

Process and culture have the greatest effect on product quality. It's a people issue. Ad-hoc process makes ad-hoc products, and the user can tell.

Earlier on, the roles to build software were somewhat defined: the business analyst modeled the domain of the customer, an architect created a model to reflect that domain, and engineers and programmers implemented that model.
This was the 'waterfall process', and while it worked for well-defined and small products, things really fell apart in the larger ones. Why? Because the understanding of what is being built changes over time, and that directly relates to the requirements of the system being built. Once the requirements become fuzzy, or not well understood, it's really up to the mind of the implementer to get it 'right'.

Observation: This phenomenon has been described as 'changing requirements', but the reality is that the requirements of what is needed don't really change; your and your team's understanding of the requirements changes over time. In the language of developers: you refactor your definition of the requirements as you learn more about them.

How about that mock-up?
From my angle, a detailed mock-up is helpful, but you need a lot more. A UI mock-up is like defining what a room looks like when you want a house. What really matters is the experience of using it, and I see this getting better quickly with design processes (CX, UX).

Observation: Define 'done' to have a chance at finishing (or at actually estimating work).
Good requirements force simplification through testing. You need requirements to define, build and test the software.

Releasing

Observation: You will learn the most about your product, team and process when you release.

Plan and be strategic all you like about the future; what you have today has the greatest effect on your success.
  • Version everything; it's the last version, not the date, that matters.
  • Shipping to the customer isn't done. Reviewing and cleaning up is done.
  • Release through the company to get to the customer. This allows collective knowledge to be reset to the new reality with every release.

Build planned work, and respond to requests, on different boards. Use a feature-driven process to build new features; this is something you can plan. Don't bother trying to plan and estimate bugs; just use a different mechanism to deal with them.

Who is building the product?

Try to use small teams for features and working groups for cross-functional items. The same group of people can work on many different teams. They do anyway, so just put some structure around it.
  • Meet and measure on outcomes, and only measure things that can't talk. People talk, so talk to them and don't try to apply metrics to their output. This builds communication, collaboration, trust, and respect.


  • Observation: Separate support requests and new feature development



  • If you are injecting work into your sprint, then there is very little value to any planning and estimation.


  • Kanban is great for support work and random requests. FDD, XP when used with Scrum can work for feature development (if emphasis is on requirements, not estimations)

Wrap it Up!

    1. Priority is Who/Why
      1. Build around customer problems
      2. Customer development never stops, close the loop with customer validation.
      3. Domain knowledge is more important than technical knowledge
    2. What
      1. Useful, Functional, and Usable will win
      2. Beware of the 'iceberg' effect; it all has to get done.
      3. Continually remove complexity, don't allow debt to overwhelm you. Complexity of your own creation is the real enemy.
    3. When
      1. Define what ‘done’ is to have any chance at correct estimates
        1.  Done is finishing the dishes, not just serving dinner.
      2. Less is more, slower is faster
    4. How
      1. Close Loops. If the issue originated with a customer, it's not done until the customer is happy with the solution.
      2. Realistic requirements, not just UI mock-ups
      3. Version everything and make smaller maps into the future
      4. Involve Developers early and over communicate, as collective ownership really works.



    Tuesday, October 15, 2019

    AI can help, if trained to do so

    AI can help humanity, but only if it isn't trained with human biases and ideologies. This is the same as with raising and training people, but with a machine it could be easier to get better results.

    AI is coming for you, or is it?

    There seems to be a large amount of both excitement and fear around the emergence of artificial intelligence. Will it take over? What is it? Can it save the world? Those are all very big questions, and for a new technology to generate these sorts of questions, and to actually make us question the impact it could have on our lives, really illustrates just how big people think it is, or could be.

    I'll leave much of the explanation of AI to the many published articles and just sum it up: AI is a new brain. We have made a new brain without giving birth to a new person or other organic being, but instead by building a machine. This brain has a tremendous amount of capability, but it doesn't have the knowledge required to be useful yet. It needs to learn, and be trained.

    What seems to cultivate fear in people is the amount of time it takes to provide the brain with knowledge. People take years of care and feeding to become thinking adults; AI machines can seemingly take minutes or hours to learn analogous tasks. Why would we be scared of that capability?

    For good or evil? 

    How is it that just because someone is smart, the others around them assume their intentions to be good, or evil? Just because a person can overwhelm another physically or mentally, does it follow that they will? That seems to be a reach, and we might know better by understanding the impact good people have had. It would follow, then, that if a person knew the difference between good and evil and the consequences of each school of thought, they would make decisions with the intention of the good. This understanding of not taking a piece of the pie, but instead making a bigger pie and sharing it, shows the enlightenment of that person's intellect. The training of the AI machine may not be any different.

    Are you scared when you meet a smart person? Are you scared when you meet a dumb one? Maybe they deserve the same reaction; but if that is how you meet people during the course of a day, you will surely generate a lot of fear for yourself.

    Decisions, decisions

    Why then do people make bad decisions? There are biases that lead them down bad paths. They fall victim to their own deficiencies to justify actions driven by vanity, greed, gluttony, envy, and so on. They just don't seem to be able to help themselves. The AI brain may present an opportunity to provide decision making that is free of these organic human constraints.

    Humans are able to reason and solve problems effectively, but run into difficulties when the brain is clouded by ideologies, whether political or religious. AI can be trained to help us solve some of our own problems, as long as that training isn't of an ideological nature.

    Bias

    Bias has to be learned. The various biases that complicate good decision making (confirmation bias, survivorship bias, loss aversion) are learned behaviors that reside in the subconscious, to be triggered at any time. This can happen without the host really knowing what is going on.
    This is also not consistent. People have bad days, and the anxiety of having a bad (or good) day will affect your decision making. As long as the electricity is on, machines don't by nature have bad days.

    To train AI properly, the validation of the training needs to incorporate not just what the machine knows, but also validate what it doesn't know, and then provide training so that the machine is averse to learning bad habits.

    Is this not the same as raising a child or mentoring an apprentice? It might not be that different in what we are doing; but how we do it can be improved to get better outcomes in the decisions the machine (or person) makes.

    Thursday, September 05, 2019

    Building Technology for the non-developer



    Here we try to understand the complexity and level-of-effort in building software, and how it relates to other technology. Software is generally a 'mystery box' to most people, so using analogies to houses, cars or boats may make it easier to understand what's involved.

    Building any technology can happen in the small or in the large, with various forms in between those two points. Whether it is a house, car, boat or software, there are varying forms of each 'technology', and actually completing each form requires different skills.

    Developing something to completion requires managing complexity. Very small examples may not have a lot of complexity. They are difficult to do well; but in the context of the larger picture they are the easiest form.

    What


    Developing any technology has varying levels of complexity, but generally follows a pattern depending on what form it is.

    1. interior/exterior look and feel
    2. small structures
    3. medium structures
    4. large structures. 


    Let's start with the easiest form: the look and feel. In buildings this is 'interior design', and in software it's 'user experience design'. Getting this right takes skill, but it's the easiest in terms of complexity.

    Small structures. In buildings this would be a shed, dog house or tree house. In software this would be a web page.

    Medium structures. In building this would be a functioning house, in software this is a functioning application.

    Large structures. In building this would be a multi-story building, all the way up to a skyscraper. In software this would be a system. Systems are embedded in other machines, like cars and factories, and run as operating systems on computers.

    How


    To create small structures, you don't need much formal understanding of the architecture, engineering or fabrication techniques used. As the complexity of the technology increases, the level of effort increases exponentially. I can build a tree house, and it will probably stick around for a while, but we couldn't build a six-story building without some formal understanding of the architecture and engineering behind it. If you or I tried to build a large building without that understanding, the building would probably never get built, and if it did manage to stand on its own, it probably wouldn't for very long.

    When


    So, how long would this take? That entirely depends on the people involved. If you are doing some decorating in a house, or some web page edits, it's usually one person doing it. If you are building a small building or a complex application, there will be many people involved, all having different roles and responsibilities.
    Skills are hard to transfer when they are specialized. So, when building a house you would have a plumber putting in pipes and connecting that system together. That person would be good at that, but how good are they going to be at interior design? And vice versa, would you trust the pipes if they were put together by an interior designer?

    Full-Stack/Master Crafts-person

    Can one person do it all? It really depends on what you are making, but these people are quite rare. You can run into people that understand the basics, but the actual skills to build each piece of that technology aren't the same.

    You may have expectations that one person can do it all at the snap of a finger. That is your personal problem of having screwed-up expectations, and has nothing to do with the people actually doing the building. This happens quite a bit with home renovations. A person may watch some TV and convince themselves that a full kitchen can be done in a weekend. This isn't based in any reality, so those expectations are completely wrong. It's not the fault of the people doing the work; it's a personal problem of misguided expectations.

    Wrap it up

    To build technology takes skills, but it also requires a good mindset and attitude. Being pragmatic and persistent in solving hard problems is the skill that creates good technology. From the outside, having realistic expectations based on evidence and understanding will allow good technology to be built. The time it takes is how long it takes, no matter how far off your expectations of how long it should take are. Those expectations are rarely correct, no matter the technology being built.






    Monday, June 03, 2019

    Math is both a language and a useful art.

    Math. 

    "Gaa!" is usually the reaction I get to that word, and I think that's too bad.

    Why do I need to know it?

    The reasons to learn math are really the same reasons we need to learn to read and write. It's not to remember words and recite what letter comes next; it's so we can communicate with other people and understand what they are talking about, even if they aren't there.
    We communicate to understand each other, and our language is made up of the bits and pieces we learn by reading and writing on our own.
    Math is a language that allows us to understand the world around us, and how it works. It's the language that we use to build technology.

    Learning to read and write a language allows you to communicate. It's not about knowing which exact pronouns and adverbs go here or there; it gives you the tools to communicate with other people and understand what is happening.
    It's similar with math; the point isn't memorizing a times table, it's about having the tools to solve problems. If you can solve math problems, many other problems just become easy to solve. Like getting employment, doing your taxes, and not getting ripped off by the slicksters you will run into during your life. It just makes life more enjoyable having more tools in your toolbox.

    By learning where math came from and why we use it, we can appreciate and understand it much more easily than through repetitive memorization. Math is the fundamental technology, and looking at how technology is defined will help make that claim a little clearer.

    Math is a handy tool we use to understand the world around us. People made it up, and use it for many purposes. Tools can be misused, but the tool itself isn't something to be scared of, even just a little bit.

    Art

    Math is art, with very practical uses. How is it an art? If you learn math as an art the practical aspects will emerge.
    1. Reasoning and critical thinking
    2. Elegance of solutions. Math is good when the solution has removed all needless complexity.
    How does math relate to other arts?
    • painting is both art, and practical when you paint rooms in a house
    • music is art, is it practical? would the world be better off without music? of course not.
    Practicality is problem solving
    • A defined, repeatable structure of problem solving can be transferred to any part of life. Use the How to Solve It steps for any problem:
      • Understand
      • Plan
      • Execute
      • Review

    Models that reflect real life


    Why math? The patterns of life all around us can be added up, so they were. There are hours in a day, and things to do. There is stuff to measure all the time, so a way to measure it was needed that everyone could understand. How do you count when you are small and learning? Usually it's with your fingers: one, two, three, four, or five fingers can signal between two people the count of something. Once we got past 10 things to deal with, it could get a bit complicated. This is where the tool called math shows its most basic and important value: the ability to model the real world with symbols and notation so we can understand it, and be able to understand the same thing together.

    You can create it all with the basic fundamentals, so learn those techniques and you will realize that more complex solutions are just extensions of these fundamentals. Don't memorize solutions; memorizing math is like memorizing colors and shapes. Just create them with the basics.

    The real-world problems early math dealt with and helped solve were not that complicated. Way back, you would have been farming, or maybe making pottery. If you were farming, you would have to know where to plant the crop seeds, or how to tend the herd of animals. How to measure this land?
    To measure the land, people used numbers to indicate how many steps (feet) they took around it, and the shape was usually what we now know as a square or rectangle. Now that all this land has been measured, it can be measured again to divide up what goes where. All of those smaller pieces can be added up to make the whole piece.
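That idea, divide the land into rectangles, measure each one, and add the pieces back up, is all the arithmetic a surveyor needed. A tiny sketch (the pace counts here are made up for illustration):

```python
# Two adjoining plots measured in paces: (width, depth).
pieces = [(30, 20), (30, 15)]

# Area of each rectangle, then the sum of the parts gives the whole.
total_area = sum(width * depth for width, depth in pieces)

# The two plots together cover the same ground as a single 30 x 35 field.
```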

    Adding (+) enabled us to go past counting on our hands, and was soon followed by subtracting (-).

    At this point in history we just have numbers and geometry. This was the world of math for a long, long time.

    Numbers

    In the west, Roman numerals were replaced with the Hindu-Arabic numerals. 117 is easier than CXVII, and positional notation enabled the same operations to work in bases other than 10; base 2 enables modern computation. These numerals originated in Arabia and India, where many important advances in math occurred while western Europe was in the dark ages.
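The point of positional notation is that the same digits and rules work in any base; a small sketch of the idea:

```python
def to_base(n, base):
    # Repeatedly peel off the lowest digit, exactly as positional
    # notation defines it: n = d0 + d1*base + d2*base^2 + ...
    digits = []
    while n:
        digits.append(str(n % base))
        n //= base
    return "".join(reversed(digits)) or "0"

# to_base(117, 10) gives "117"; to_base(117, 2) gives the base-2
# form that modern computers work with.
```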

    Commerce

    As people interacted and traded with each other, they needed to know how many potatoes they were trading for those 4 chickens. Currency was used, and basic math ensured that people could trust it as a mechanism to trade fairly.

    Time

    Measuring time did a whole lot to help us work with each other and understand how things worked in the world around us. How far away something was, and how long it would take to get there, could be calculated.

    Geometry

    The size of the world became comprehensible once we realized it wasn't flat, and the geometry of a sphere was discovered to understand it. The same geometry works on a soap bubble and a basketball. That's a powerful tool!

    People

    Math was now valuable, so the sways of people and the ideologies they brought with them shaped the development of math and its understanding among the general public. This continues to this day!

    Technology

    Technology is a term used to describe the set of tools that we have made for ourselves. Math is the underlying technology to it all.
    Math determines the size of a house and how big or small it can be. The correctness that math enables in engineering and architecture enabled more and more technology: houses, buildings, trains, automobiles and airplanes followed.
    Computing and computing machines are an example of applying many types of mathematics to enable the many amazing things we have around us today.
    To build any technology we use math to define and tie all the components together; this is why it's the fundamental technology.

    Who does math?

    This is a list of my favourite characters from the long history that mathematics has.

    https://al3x.svbtle.com/alexander-grothendieck

    Galois

    John Holland

    Euler

    Polya


    Wrap it up!

    English and other human languages allow us to understand and interact with people. Math allows us to interact with and understand things that can't describe themselves.

    As a human invention, it's really amazing that math is a language that continues to explain the models of biology, chemistry and physics. Is this a limiting factor to the further understanding of these subjects?

    From History of Mathematics courses at the university level, what is astonishing to me is the organic nature of the development of mathematics. The complex mathematical rules that govern the universe were developed many centuries ago from a simple need to describe the concept of a number.
    It was only through an iterative process of experimentation, modeling, trial and error, approximation and documentation that we as humanity were able to make the many leaps in science, technology, engineering and mathematics.

    Computing technology now provides us with tools which can expedite these experiences and learning. We should embrace these tools and take mathematics to a new level by empowering students to write about their experiences. Let’s also not forget the power of the incredible, easily accessible technology known as pen and paper.



    Monday, April 22, 2019

    Scaling your web application for many users

    There are many patterns of good and bad practice used when designing applications and scaling databases. This post is from lessons learned using data and databases with dozens of applications, starting with creating slow applications early in my career, and scaling slow applications I didn't create later in my career.
    It turns out that being able to see the application through its metrics is the key to making it fast. When you can observe with clarity, you can fix the major issues and know those issues are the actual performance problems. To that end, let us:
    1. Understand what makes an application busy and slow
    2. Understand the data usage of highly available, scalable web applications
    3. Identify data bottlenecks in servicing requests
    4. Monitor, and let your metrics steer your strategy for what to do next

    A busy app

    We've all gone to a website when it's busy and seen it take 'forever' to respond to requests, or just not be able to service any requests at all. Why is this happening? It's the same fundamental reason we wait in line at a checkout at the department store: the infrastructure involved can only handle a certain amount of throughput. Scalability is all about servicing requests, whether it's a fast food restaurant or a software stack.

    Little's Law helps us understand this phenomenon: https://en.wikipedia.org/wiki/Little%27s_law
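As a sketch of what Little's Law says in practice (the numbers below are made up for illustration): the average number of requests in flight equals the arrival rate multiplied by the average time each request takes.

```python
# Little's Law: L = lambda * W
#   L      = average number of requests in the system (in flight)
#   lambda = arrival rate (requests per second)
#   W      = average time a request spends in the system (seconds)

def requests_in_flight(arrival_rate_per_s, avg_latency_s):
    return arrival_rate_per_s * avg_latency_s

# 50 requests/second at 200 ms each means about 10 requests are in
# flight at any moment; the same hardware at the same concurrency
# can absorb twice the arrival rate if latency is halved.
```

This is why making each request cheaper (caching, moving slow work off the request thread) buys capacity just as surely as adding servers does.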



    So, to make an app faster, it has to be more efficient at using the resources available to service the requests it's receiving, so you have to understand the life cycle of a request from start to finish.
    If modifying the software isn't an option, then more resources are needed; but piling on more memory and CPU speed will only take you so far. Scaling out your infrastructure properly requires software that utilizes its environment efficiently.

    Where does the data go?
    Before you can make your app perform its work in a more efficient way, you will need to understand what data the app is using and where it is coming from.
    Review how the data is used, and classify what is temporal data and what needs to be persisted. I covered those concepts in a previous post.

    Profiling an application

    At a previous start-up, I was newly hired to lead the development work. During the interview I asked how the app was doing, and the reply was: "It's great, but it's a bit slow when people are using it. We just paid a consultant a bunch of money to put it on AWS, so it should be fine."
    That's a pretty typical response, so I proceeded.
    On the first day I opened up the dashboard on AWS: the application was deployed on one instance and the database on another. Both machines were running at 100% CPU and consuming a bunch of memory doing it. There was one application log, and it had thousands of lines of exception stack traces. The bill for all of this was about 16K a month, and the budget was about 1K, so the super-green business founders were panicking.
    Where do you start from there?

    Here is my General Strategy:

    1. Build architecture that evolves; better ideas will come, but get on with it.
    2. Measure and get clarity on what the system is doing; don't guess at what it could be.
    3. Things break. Any technology that is run hard will break over time, so just deal with it.


    Monitor
    Sometimes AWS will swap out a load balancer at 3am when you are on vacation; stuff happens. So it's key to have monitors on all your components, with thresholds that alert you when they have been crossed: things like CPU % on a database machine, connections to a database, and requests per second. Monitor everything you can think of, and fix your logging so you have clarity on what is happening whenever you want.
    Good Practice: Software isn't a mystery box. All of the components work together (well or not), so understand what a good and a bad state of any component is, and monitor it. Cloud providers have their own logging and monitoring tools, or use a service like Datadog to get some reporting on your setup.
    Tasks
    In your application, you may have dependencies that take some time to complete. Sending emails and generating reports are the usual culprits. Use a task/message queue to run those pieces of functionality separately from the thread that is servicing the user's request. In the case of email, have an email task that runs separately, so your app just starts that task and continues without waiting for the email to send.
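    As a sketch of the idea, here is an in-process stand-in for a real task queue (Celery, RQ, SQS, ...); the function and variable names are made up for the demo:

```python
import queue
import threading

# A minimal background-task sketch: the request thread enqueues work
# ("send an email") and returns immediately; a worker thread drains the
# queue. A real task queue adds persistence and retries on top of this.

tasks: queue.Queue = queue.Queue()
sent = []  # stands in for the real email gateway

def email_worker():
    while True:
        msg = tasks.get()
        if msg is None:          # sentinel: shut the worker down
            break
        sent.append(msg)         # really: smtp.send(msg)
        tasks.task_done()

worker = threading.Thread(target=email_worker, daemon=True)
worker.start()

def handle_request(user):
    # The request thread only enqueues; it never waits on SMTP.
    tasks.put(f"welcome email for {user}")
    return "202 Accepted"

status = handle_request("alice")
tasks.join()                     # only for the demo: wait for the worker
```

    The request handler returns immediately; the slow work happens on the worker thread.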

    Scaling an application

    The app can be horizontally scaled as long as it's a stateless application: it keeps no data in memory or on disk on the machine, but writes to and reads from a central database.

    Caching
    This is why it is a key architectural practice to centralize the temporal and persisted data. Using in-memory caches and local data-stores introduces complexity around synchronizing the outputs, and then your scaling limit is the hardware and OS of a single machine.
    When caching data, understand how old the data can be; how stale is stale? Be careful with expiry times in caching; in general, just don't use expiry unless you have to. Pushing out old data with new will get you far.



    Monitor and measure requests to the application

    See your log files and machine stats (htop on linux is the classic)

    Use grep on specific logs to find out what gets called the most often.

    grep "GET /api" logs.text | cut -d "/" -f 4 | sort -n | uniq -c
    grep "POST /admin" logs.text | cut -d "/" -f 5 | sort -n | uniq -c
    grep "POST /api" logs.text | cut -d "/" -f 6 | sort -n | uniq -c
    grep "GET /api" logs.text | cut -d "/" -f 6 | sort -n | uniq -c
    See the full path:
    grep "GET /api" logs.text | cut -d " " -f 21 | sort -n | uniq -c
    Calls to the API vs. calls for static resources:
    grep "GET /api" logs.text | cut -d "/" -f 4 | sort -n | uniq -c
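    When the grep/cut pipelines get unwieldy, the same per-endpoint tally can be sketched in a few lines of Python. The log line format below is assumed; adjust the parsing to your own logs:

```python
from collections import Counter

# Tally requests per endpoint from (assumed) "METHOD PATH STATUS" lines.
log_lines = [
    "GET /api/users 200",
    "GET /api/users 200",
    "GET /api/orders 200",
    "POST /api/orders 201",
]

hits = Counter()
for line in log_lines:
    method, path, _status = line.split()
    endpoint = "/".join(path.split("/")[:3])   # e.g. /api/users
    hits[(method, endpoint)] += 1

# Most-hit endpoints first, like `sort | uniq -c | sort -rn`.
for (method, endpoint), n in hits.most_common():
    print(n, method, endpoint)
```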

    Scaling a Database

    If the application can be horizontally scaled, the database is generally the prime area for bottlenecks. Web applications servicing many requests need the database to be consistent and accurate. How do we get the data into the database efficiently so we can get it out quickly?

    Connections

    First, the application needs to connect to the database, and the connection itself can be expensive to create; so monitor the connections to the database. Use a pool of connections and borrow from that pool instead of creating new ones for each request.
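    To show the shape of the idea, here is a toy pool using the stdlib and sqlite; in a real app, use your driver's or ORM's pooling (e.g. SQLAlchemy's, psycopg2's pool) rather than rolling your own:

```python
import queue
import sqlite3

# A toy connection pool: create N connections up front, then
# borrow/return them via a queue instead of reconnecting per request.
class Pool:
    def __init__(self, size: int):
        self._conns = queue.Queue()
        for _ in range(size):
            # check_same_thread=False so borrowed connections can cross threads
            self._conns.put(sqlite3.connect(":memory:", check_same_thread=False))

    def borrow(self):
        return self._conns.get()       # blocks if the pool is exhausted

    def give_back(self, conn):
        self._conns.put(conn)

pool = Pool(size=2)
conn = pool.borrow()
one = conn.execute("SELECT 1").fetchone()[0]
pool.give_back(conn)
```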

    Replicate

    Use a primary (write) database and secondary (read-only) replicas to scale. Your app is probably 90% read queries, so there's a lot of room to grow there.

    Route read-only requests

    In your application code you could route some read-only queries to the replica directly. This is only a good pattern if the request is read-only in its entirety and isn't updating any data to get its results. Replication in the database is fast, but it happens asynchronously, so replica reads can be slightly stale.
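    A minimal sketch of that routing decision; the "connections" here are plain dicts standing in for real ones, and the names are made up:

```python
# Route pure reads to a replica, everything else to the primary.
primary = {"url": "db-primary:5432"}
replica = {"url": "db-replica:5432"}

def pick_connection(sql: str) -> dict:
    # Remember: replica data can lag slightly behind the primary,
    # so only route requests that tolerate slightly stale reads.
    if sql.lstrip().upper().startswith("SELECT"):
        return replica
    return primary

target = pick_connection("SELECT * FROM users")
```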

    Cache some results

    In your app you have queries to the database, or calls to dependencies, whose results probably aren't changing a whole lot. Cache these results so you don't overload the db with redundant queries.
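    A minimal read-through cache illustrating the pattern (names are made up; a shared cache like redis is the production version of this, and TTL is used here only to show the idea):

```python
import time

_cache: dict = {}

def cached_query(sql: str, run_query, ttl_seconds: float = 60.0):
    # Serve a fresh-enough cached result; otherwise hit the db and store it.
    now = time.monotonic()
    hit = _cache.get(sql)
    if hit is not None and now - hit[0] < ttl_seconds:
        return hit[1]                      # fresh enough: skip the db
    result = run_query(sql)                # miss or stale: hit the db
    _cache[sql] = (now, result)
    return result

calls = []
def fake_db(sql):                          # stands in for the real database
    calls.append(sql)
    return [("alice",), ("bob",)]

rows1 = cached_query("SELECT name FROM users", fake_db)
rows2 = cached_query("SELECT name FROM users", fake_db)   # served from cache
```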

    Determining Concurrent Users

    1. Get the max throughput (req/sec) with no wait/think time. This is a test scenario; actual usage is more variable: users make requests and ponder the results before taking a new action that results in another request. Factor this in to find an average time between requests.
    • req per sec = concurrent users / (response time + think time)
    • concurrent users = (response time + think time) * req per sec
    Using a target response time you can determine the system's capacity. Draw a graph with multiple measurements to find where contention starts and the system begins degrading. Graph a base (1x usage) and an upper bound (5 or 6x usage).

    2. Optimize for the hardware; CPU speed is very important for the db machine.

    • memory is cheap: keep data in memory, the db is the backup
    • RAM should be 3 to 6x the size of the database; ideally the db fits in shared_buffers
    • stack size should be a multiple of the page size, between 2-8 MB
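    The two formulas above are rearrangements of Little's Law; a quick sketch with made-up numbers:

```python
# Little's Law: concurrency = throughput * (response time + think time).
def concurrent_users(req_per_second: float, response_s: float, think_s: float) -> float:
    return req_per_second * (response_s + think_s)

def req_per_sec(users: float, response_s: float, think_s: float) -> float:
    return users / (response_s + think_s)

# 50 req/s with 0.2 s responses and 9.8 s of think time between actions:
users = concurrent_users(50, 0.2, 9.8)   # -> 500.0 concurrent users
```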

    Test and Measure

    Monitoring and Metrics

    1. Log and understand the traffic, and find out which queries are running long. MySQL and Postgres have slow query logs, so try logging any query that takes longer than 5 seconds. I have discovered 30-second queries that have bottlenecked many apps this way.
    2. Analyze the query to understand what its execution plan is. 
      1. use EXPLAIN ANALYZE to get the execution plan
    3. Check indexes (btrees that enable fast searching); index the fields that are used in queries.
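    The habit can be practiced with the stdlib's sqlite3; EXPLAIN QUERY PLAN is sqlite's simpler cousin of Postgres's EXPLAIN ANALYZE. The table and column names are made up for the demo:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
db.executemany("INSERT INTO users (email) VALUES (?)",
               [(f"u{i}@example.com",) for i in range(1000)])

def plan(sql: str) -> str:
    # The last column of each EXPLAIN QUERY PLAN row is the detail text.
    return " ".join(row[3] for row in db.execute("EXPLAIN QUERY PLAN " + sql))

# Without an index on email the query is a full table scan...
before = plan("SELECT id FROM users WHERE email = 'u42@example.com'")
db.execute("CREATE INDEX idx_users_email ON users (email)")
# ...with it, the plan becomes an index search.
after = plan("SELECT id FROM users WHERE email = 'u42@example.com'")
```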

    What to watch for:

    • database IO
    • database connection time and total connections
    • database machine CPU
    • database machine disk io
    • cache hits

    You can and should put alerts on thresholds for any of these metrics. An example would be an alert on 80% capacity of total available memory.

    Reporting

    Create trending reports: is there a pattern of steady growth? Response times should grow roughly linearly with load before leveling off. There could be a bump for the first few requests if some caching and resources use a lazy-load strategy, but once those are warmed up, response times for new resources should stay on that linear trend over time.

    Test Data

    You want to replicate a production environment, or build up enough data from testing to be confident the data will roughly match a production situation. You won't be able to replicate the production load, but that isn't really the point; you are validating that your instrumentation and metrics work and provide an accurate picture of what is happening. Observability is the key outcome here.

    Types of Tests

    Load - what is the throughput of the system? Establish baseline metrics for how many requests a system can handle

    Capacity - how many concurrent users can the system handle? Establish maximum users and expected users

    Endurance - with expected users, run long running tests. Establish baseline hardware and other environment parameters

    Read some more.....

    • http://highscalability.com/
    • http://thebuild.com/blog/
    • http://reinout.vanrees.org/weblog/2012/06/04/djangocon-postgres.html
    • http://venkateshcm.com/2014/05/Caching-To-Scale-Web-Applications/

    Wrap it up!

    Building a scalable, high-performing application is really about knowing the components in the application and how well they use the environment around them. If something 'feels' slow, instrument and measure the pieces to see what the issues are. The key is increasing the serviceability of the application so you know which components are slow and where you can put work into increasing the overall throughput to keep the users happy.






    Tuesday, April 16, 2019

    Data design for web applications

    There are many choices of database technologies to base your application on; so how do you make sense of the options? Are you making a single user application or a service that can have thousands of users?
    For web applications that have many users, the logic in the application builds more than enough complexity, so it's important to simplify the tools you use around the app. To this end it is good practice to make safe, and sometimes boring, architectural decisions when it comes to your database.

    In this post we:

    1. Understand what types of data are used in an application.
    2. Look at which database and caching technologies are the best fits for the different types of data.
    3. Go through some guidelines to create a solid datamodel foundation for an application that can be scaled for more users later.

    Data

    Firstly, you need to understand that all data isn't the same, and it needs to be stored according to how it is used. What sort of data are you saving? How long does it have to live? How fresh does it have to be? Let this dictate your strategy for what system you use. 
    Web applications tend to read a little bit of data from many places, while more analytical applications can have queries that take a long time. We get into analyzing queries for performance in a different post, since it's important to understand how that works no matter what type of application the data is used for.
    Data analysis applications have large amounts of data but fewer concurrent users than web applications. The ingestion of new data into a data warehouse is handled by pipelines that are somewhat deterministic in nature; you control when and how it happens. With a widely available web application, the data is updated whenever the users are using the app, and that doesn't follow a consistent timeline.

    It's important to know how 'hot' the data is: is it being updated constantly in small amounts, or is it updated all at once with a lot of data to be analyzed later?

    So, classify your data along the lines of:
    • Data is short living and is used often.
    • ...variations...
    • Data is long living and used rarely.
    There is a big range in-between, so it's important to understand what data you are dealing with.
    In the context of this post we are talking about using data for web applications that have many users.

    Temporal/Dynamic Data

    Data like session tokens is considered temporal or 'ephemeral'. This type of data has a short lifetime and needs to be accessed quickly and frequently. Session tokens, permissions, and feature flags are examples. They are created and used while the app is running, and disposed of after the app has stopped or a user's request is finished.

    Pattern: Use a dedicated caching app (redis, memcached, etc.) to store this type of data. I have used/abused redis for many use cases, and it's been fast and correct no matter what job I gave it.


    For web applications, there will be assets used by the pages, like style sheets, javascript and images. These artifacts could be loaded by the webserver each time, but they don't change between requests.

    Pattern: for static assets, push their persistence towards the user. It's good practice to cache these types of artifacts in front of the application with a reverse proxy, or push them out to a CDN. This is easier than it sounds and has a tremendous effect on the user experience.

    For reverse proxies, try nginx or varnish as a quick and easy way to cache heavy artifacts up front, so all your application CPU cycles are spent on application requests and not on serving the same file time and again.

    Persistent data

    When you model the application you are building, you will end up with some first class objects that need to have their data representation stored for long periods of time. Like a user object. Instead of the user creating their account each time they want to use your site, they create it once and access it many times afterwards.
    Some types of data don't belong in the database at all. Logging is a great example: this is data the application produces, and it may seem like a good candidate to keep in the database to query for usage, but it is much better handled by logging systems. You are probably looking for a report that can take a long time to complete, and that would compete for resource time that is better spent servicing user requests.

    What kind of database?

    Like programming languages, it's understanding the context of the problem that gives you the most insight into what to use. The options are many: saving files on disk, using various relational dbs, or using newer document-oriented dbs.

    Why not save to files?

    Maybe you have made applications on your desktop and saved data to a file. Why won't that work for a web application? It's because web applications have many users using the application at once, instead of just one user at a time. To support many users at once, web applications are multi-threaded, so you have many threads trying to access, update and delete items from that single file. This is called 'resource contention', and you would have to write the code around the file operations to be thread-safe.
    Exercise: Try this out with a unit test that starts many threads operating on the same file, and you will see why quite quickly.
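    A sketch of that exercise: many threads read-increment-write a counter stored in a shared file. With the lock in place the count comes out exact; remove the lock and increments start disappearing:

```python
import os
import tempfile
import threading

# Shared file holding a counter that several threads will update.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "w") as f:
    f.write("0")

lock = threading.Lock()

def bump(times: int):
    for _ in range(times):
        with lock:                        # delete this lock and updates vanish
            with open(path) as f:
                n = int(f.read())
            with open(path, "w") as f:
                f.write(str(n + 1))

threads = [threading.Thread(target=bump, args=(100,)) for _ in range(8)]
for t in threads: t.start()
for t in threads: t.join()

with open(path) as f:
    total = int(f.read())                 # exactly 800 with the lock in place
os.remove(path)
```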

    Why not use the session store?

    When building a web application you always have access to the session store, and it may be tempting to store data in the session and then resolve the final results in the database when the user's request is finished. This is very problematic: with many users making requests at the same time the data is constantly changing, so its validity gets lost if transactions are deferred until the request or long-living session ends. It's just more to keep in sync, and not worth the extra complexity.

    What does all this look like?

    For a basic web app, the application uses its own memory and a database.



    Pattern: Separate the storing of data in more appropriate technologies




    Databases...

    There are different types of database technologies. This post largely covers relational databases, but there are other types you can use, depending on your needs. Document dbs like mongo and couchbase have great feature sets, but can be tricky to figure out how to use properly. Do you have the time and budget needed to figure them out? If you are building an app that does the standard CRUD operations, it's probably a good idea to pick a tried and tested RDBMS, and then let the usage of the application help you make decisions about what to do next. Different tools are just better for different tasks. Aerospike looks awesome in this space, with more options coming up all the time.


    Modern databases give you multi-threaded access and the ability to model your entities as data objects; so use them!


    For relational data I have used Postgres with the most success, but MySQL will give you good results as well. Oracle apparently has a database, but it's rarely worth the money, or the time lost talking to their salespeople.


    NoSQL

    Lately there has been a trend toward non-relational databases, but in many cases this is a shiny toy; you can run relational databases for many thousands of users, and the technology is very stable. Newer databases like mongo or couch are document-oriented and may seem like a good idea, but keep in mind the added complexity you are taking on.

    Cloud

    Internet companies like google (spanner) and amazon (dynamodb) have created db systems for their own operations. It's fun to think of your project needing this technology, but let the usage of your application dictate when to move to these types of solutions.

    Keep in mind you can run mysql or postgres in the cloud, with AWS, Azure or GCP, and they allow you to provision machines and monitor how the db uses the machine and responds to requests.
    Tip: If you are running mysql on AWS, switch to Aurora; it's a drop-in replacement and will save you money on the instance hours required to service your application.

    One database technology I have found to be very useful for basic webapps is the cloud datastore in Google App Engine. This scales automatically with usage and is very cheap to run in the small.

    Design Steps for a datamodel

    There are lots of possibilities, and different solutions are useful for different problems. Here we are talking about SQL databases for applications that support many users: the data about those users, and all the data being used by the application itself to satisfy the intended use case.

    SQL

    Everything is stored in tables made up of rows and columns, and you use SQL to query those tables. The relational model is a flat, two-dimensional view of data. You can do many things to represent the data to the outside, but it still resides in tables.
    Learn SQL so you can look into your database from the db admin console, without the application running.

    Pattern: Basic steps for creating a datamodel
    1. Identify the entities and attributes
      1. usually the nouns in the spec about the data
    2. Determine the attributes types
      1. the types of the fields for the data
    3. Identify the relationships between the entities
      1. Patterns of common usage appear here
    4. Add system fields 
      1. id primary key, timestamps.
    5. Normalize enough
      1. but not too much
    6. Index fields that are searched by the application queries
      1. measure usage to see what to index
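    The steps above, applied to a made-up two-entity example (sqlite is used here so the sketch is self-contained; the schema is illustrative only):

```python
import sqlite3

# Entities: users and posts. Attributes get types, the relationship is a
# foreign key, system fields (id, created_at) are added, and a searched
# column gets an index.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE users (
    id         INTEGER PRIMARY KEY,            -- system field
    email      TEXT NOT NULL UNIQUE,           -- attribute + type
    created_at TEXT DEFAULT CURRENT_TIMESTAMP  -- system field
);
CREATE TABLE posts (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(id),  -- relationship
    title      TEXT NOT NULL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_posts_user_id ON posts (user_id);     -- indexed lookup
""")

db.execute("INSERT INTO users (email) VALUES ('a@example.com')")
db.execute("INSERT INTO posts (user_id, title) VALUES (1, 'hello')")
count = db.execute(
    "SELECT COUNT(*) FROM posts JOIN users ON posts.user_id = users.id"
).fetchone()[0]
```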

    Entities

    You can think of the database for the application as the foundation of a house. If its built to last, your house can be large. If its brittle, the house will fall down.
    Take time when discovering relations between entities and use Normalization techniques to create a solid foundation for your application.

    Saving Objects
    Depending on your database and language of choice, there are tools available to map the objects you have in your app to the actual data persistence mechanism. Transactions are atomic, so make sure to prepare everything that is needed before executing the transaction. Keep the steps repeatable and reversible so the transaction can be 'rolled back'.
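    A sketch of that atomic/rollback behaviour, using sqlite's connection-as-context-manager (the table and the simulated failure are made up for the demo):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
db.execute("INSERT INTO accounts VALUES ('a', 100), ('b', 0)")
db.commit()

try:
    # sqlite3's connection context manager commits on success and rolls
    # back on an exception, so a half-done transfer never persists.
    with db:
        db.execute("UPDATE accounts SET balance = balance - 50 WHERE name = 'a'")
        raise RuntimeError("simulated failure mid-transfer")
        db.execute("UPDATE accounts SET balance = balance + 50 WHERE name = 'b'")
except RuntimeError:
    pass

balance_a = db.execute(
    "SELECT balance FROM accounts WHERE name = 'a'"
).fetchone()[0]
# balance_a is still 100: the partial update was rolled back
```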

    You could be tempted to construct SQL strings in your application logic, but this approach becomes complex very quickly. Look for Object Relational Mapping tools and use them instead of rolling your own.

    Identify and create relationships 
    Try to flesh this out with diagrams and models before throwing foreign keys into the tables. Many-to-many relationships need to be broken out into a join table of only primary keys.
    Understanding referential integrity will save your application from having to validate these relationships itself; that complexity can be dealt with in the database.

    Normalize the entities. Normalization is just keeping redundant data in its own spot: it makes sure there is only one instance of each piece of data, which can then be represented in multiple views. Watch for doing too much here, as performance can suffer with the resulting joins needed in the queries.

    De-normalize for performance 
    Third normal form is probably enough. Too much normalization puts complexity into the queries needed to join results, and could cause performance problems at scale; but that scale needs to be very large for this to be a problem.

    Pattern: Data needs to be indexed properly for efficient queries. It's like books in a library: new books aren't just lumped into a pile; they are labeled and placed in the proper location.
    When indexing a database, take care to understand which parts need indexing and why. Indexing is necessary, as lookups would be slow otherwise.

    Migrations

    As the application gets developed over time, the database will change: columns and indexes will be added, and some types changed. How do we change the structure of the database as we go along? This is commonly referred to as migrations. Migrations for relational databases are sql statements that alter a column or a table to match changes in the application. Be very pragmatic and disciplined about migrations.
    Pattern: Version the migration the same as the software version, then its easy at a glance to see what migrations are needed for any given release.
    Many platforms have built-in migration features. Django for python has a good way of keeping this organized, and jOOQ for java is very slick at keeping migrations aligned.
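    To show the mechanics, here is a bare-bones versioned migration runner; in practice, use your framework's tooling (Django migrations, Flyway, etc.). All names here are illustrative:

```python
import sqlite3

# Migrations are versioned SQL statements; a schema_version table records
# which ones have already been applied, so the runner is safe to re-run.
MIGRATIONS = {
    1: "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)",
    2: "ALTER TABLE users ADD COLUMN created_at TEXT",
}

def migrate(db):
    db.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)")
    current = db.execute("SELECT MAX(version) FROM schema_version").fetchone()[0] or 0
    for version in sorted(v for v in MIGRATIONS if v > current):
        db.execute(MIGRATIONS[version])
        db.execute("INSERT INTO schema_version VALUES (?)", (version,))
    db.commit()

db = sqlite3.connect(":memory:")
migrate(db)
cols = [row[1] for row in db.execute("PRAGMA table_info(users)")]
```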

    Wrap it up!

    Designing your database well will take your application far if you take some time to get it correct. You will learn a lot about the application, and about how accurate the requirements are, by spending time on the design of your datamodel: understanding the relationships between entities, which entities are used most often, and how much 'work' the database can do for you instead of writing that logic in your application code.

    Use the right tools for the job: storing data in its appropriate place will reduce complexity in the application and allow it to grow.




    Friday, January 25, 2019

    Scrum, XP, FDD or Kanban? Use all of them!

    There are lots of terms in the agile development world; it's clear that we are still figuring out the best way to make software. Here I illustrate the use of software development methodologies and what I have seen work in past projects. A lot depends on the project, the people available, and what the end goal is. My intention is to illustrate that one process or methodology isn't the answer; instead, the most effective process depends on your situation. It's the same as using a screwdriver with nails, or a hammer with screws: be pragmatic and use the tool that is appropriate for the job at hand. The main goal is the same: create quality technology.

    Scrum time

    How do we go about doing this? Let's firstly understand how Scrum can be used for organizing how people work together to build both projects and products. Scrum can be used to build anything; a bird house, a home renovation, car repair, and software. It's got nothing to do with the actual technology and everything to do with organizing the group of people doing it.

    It can help to realize and understand some fundamentals of how we are doing any of these things:
    • where am I in building this?
    • what am I trying to accomplish today?
    • what are the dependencies that I need to get that done?
    • what is preventing me from getting this done?

    Scrum the Bad

    Scrum can work, but too often it's misused, and becomes a tracker of time sheets and velocity at the expense of the product the team is actually making. Care about adding value to the product; no customer is paying the company for time-sheets or velocity. These metrics just don't matter in the end.

    Dependencies

    For someone managing this process, those last two points are everything. If some piece of work is dependent on other items getting done, the job is to make sure those dependencies are in order and getting done, and to know whether those items are blocked in any way.
    In our home reno project, a plumber may be ready to install the bathtub, but to install it the materials need to be there (pipes and a bathtub). So buying the tub and transporting it from the store to the house is a dependency.
    For software, it could be a button on the UI that submits a form; you could have your UI developer create that, but if the server API doesn't exist and the logic isn't in place to save the value in the database, then that UI developer is blocked. Ideally these dependencies are in place by the time that front-end work starts.

    So if Scrum is a general way to organize people, then it needs to be used in conjunction with a methodology that actually builds the technology. This is where terms like XP, FDD, and Spiral come in. So, when talking about Scrum, it's really Scrum/XP or Scrum/FDD, or Scrum/RandomConversation. Only with that combination do you have a chance of starting and finishing a project; Scrum helps you organize the participants of a project, but you need a definition of starting and ending. Is Random Conversation a process? Not officially, but many organizations use it!

    Product or Project?

    Firstly, what is the context of the effort? Is it a one-off project for a customer, or a scheduled release that adds to a product? For a one-off project you have constraints of time, since you are getting paid based on the time spent, and when the project is over there is no more development.
    For a product, the constraints are the features in the product. So the measure of success is the amount of value you add to the product, and how it sets you up for the next release. For the sake of analogy, it's more like continuously tending a garden that produces vegetables than it is growing one plant to produce one vegetable that you hand to the person asking for it.

    Product

    • Increased importance of maintainability, since you have to live with this code for the long term. You have the opportunity to get it right over a number of releases, but you have to deliver enough value that the product can be used today.
    • High need for innovation, as your effort is trying to add value to the product that your competitors don't have. What is the differentiator in what you are making compared to the competition?

    Project

    This is something that is budgeted with money, so it's a time-and-materials type of effort. The client really just wants a working tool that solves a single problem, so the need for innovation and invention generally isn't part of this. It's important to use established technology and process so that the project has a measure of predictability. This is essential: if the project ends with the technology unfinished, who is going to pay for the remaining time to complete it?
    • you have one chance to get it right, so there is an increased importance on requirements and design, and on only building what you have time to do. 

    How to do this?

    Product

    If you are making a product that has no customers yet, then you are adding features to get to a first version that you can release, putting you on the road to paying customers. This is where the concepts of MVP and Definition of Done apply. What is the simplest version of the product that fulfills the users' goals and solves a real problem? For this, Scrum/FDD or Scrum/XP has been very effective at constantly iterating to get to that first release.

    What happens after that first release? You need to keep iterating on the product to increase its value for the next release, but you also have customers using it, so you need to respond to them and fix their issues (or not).
    For this, split your effort accordingly.

    1. Continue to use the Scrum/whatever methodology you used to get there, with the same sprint cadence and people dedicated to that process. This work can be planned: you are in control of how much you take on, how much innovation is attempted, and what you are happy to release.
    2. Use Kanban to organize the unplanned work. As issues come in, have someone dedicated to managing them so that the customer gets a response, and the learnings from that become inputs to the next sprint. It may be best to have someone dedicated to support during each sprint so the other members aren't interrupted; but rotate this position, as it becomes a bit of a grind.

    DO NOT combine planned and unplanned work in the same process; you will end up doing neither well. 

    Project

    You are making technology to a specification, and that specification needs to be understood and agreed to by the customer and the builders. Getting this correct will make or break your project, so take the time to get it right, and get it done. Using iterations during this phase should involve the customer and builder directly; it's intensive from a people-participation aspect, but vital so that expectations and assumptions aren't left to interpretation at the end.

    Depending on the project, a number of methodologies are useful here. Waterfall can work, and the software may be part of a larger effort, so it's important to use a similar overall methodology for the whole project. This is also about predictability: invention isn't really welcome here, and using established technology is necessary. This may seem less fun at first, but going over time and having to work for free to fulfill the expectations agreed upon at the start is a whole lot less fun.

    Wrap it up

    For Projects
    1. Use established technologies (even if they feel old-school, the client probably isn't paying for innovation) 
    2. Invest time in getting the specification correct, and set expectations for further changes. (Jobs to be Done, and Domain modelling)
    3. Aggressively remove work if it's not needed.
    4. Identify and resolve dependencies, and understand what's needed. 
    For Products
    1. Use Scrum with a methodology to plan and execute feature sprints. (Personas, Jobs to be Done, Quality Model)
    2. Use Kanban to organize unplanned work so feature work isn't interrupted and customer requests are addressed.
    3. Review each sprint to ensure each release is better than the last.
    4. Innovate and add value so that the product is needed by its users. This is a process that lasts as long as your product is successful; so foster a culture of continuous improvement to get better over time.
    It can be done! Projects are similar regardless of the technology; the client needs something to spec, since the time and money are finite. So whether it's a home renovation or a website, use established tech and keep expectations realistic. For products, you need to continuously add value while being responsive. Doing either well isn't easy, but it can be done by sticking to some sound fundamentals.

    Monday, January 21, 2019

    Feature Definition with a Quality Model

    Defining all that goes into a feature can sometimes stop at the functionality, without taking into account the various system attributes that combine to make the feature actually successful for the user. Attributes like usability, performance and security add to or take away from the user experience, so understanding them as a whole is essential.

    This post outlines the usage of a Quality Model that takes into account the functional and cross-functional features to provide a template that can be used as a definition of done.

    Definition of Done

    The quality model is really useful for understanding the scope of what needs to be done for a single feature, or for a theme of many features together. The application you are building is just a collection of features working together, and how they work together is ultimately how the user experiences that set of features. How fast, how secure, and how easy they are to use is what is key. These attributes are sometimes referred to as 'cross-functional', as they affect all functionality in the system. The term 'non-functional' is also sometimes used, but it seems misleading: they all do something, and it's hard to convey the importance of working on something that is 'non-functional'.

    Where are these defined?
    These constraints are sometimes included in issues or explained in a specification. It's important to be consistent and treat these as the requirements they are. To that end, use acceptance criteria from the domain requirements to outline the compliance targets for the quality requirements. 

    This is applicable from the large to the small: from product architecture, to application architecture, to designing a single feature. This template can be used to fix a bug, define a feature, or define an entire product. These constraints provide the definition of done, or feature completeness.
    Is the product ever done? Hopefully not! The quality model gives you a high-level glimpse of what it does today. It combines the attributes of the user-facing functionality with everything around it, so it provides a much clearer picture of what is required to get a feature to a done state.

    In its simplest version, this is the basic release of any software:
    Functionality: by theme, use domain research
    Usability: goals and experience
    Performance: general feel and response times
    Portability: dev, staging, production
    Serviceability: observable logging, exceptions, query times, dependency health and status


    • Functional
      • Business/human-facing attributes that satisfy the business rules/problem domain.
      • Match personas/jobs-to-be-done with the features that get their job done.
      • Domain requirements come from a subject matter expert.
      • A set of attributes that bear on the existence of a set of functions and their specified properties; the functions are those that satisfy stated or implied needs.
      • Suitability
        • Are there relevant business requirements?
      • Accuracy
        • Is the data shown to the user accurate?
      • Compliance
        • Are there rules/regulations constraining the functionality? Are these known and declared?
    • Usable 
      • Is the system usable without a lot of training and previous knowledge? How intuitive are the controls and workflow? How does the user interface 'look and feel'? This is the domain of user experience.
        • How does the app match the job workflow? Is it similar and intuitive compared to what I would do?
        • Match jobs-to-be-done to the key problem: set the requirements, then fulfil the solution.
      • This is the 'user interface', the 'look and feel'. For print and publications this is called the 'design'; in software it is the 'aesthetic design', and it is really the tip of the iceberg. You only see about 15-20% of a software system, but it's a very important part. It has to be a comfortable place to get things done efficiently.
        • Survey users
          • UI/UX, HCI (human-computer interaction) research
        • Accessibility - the application is a web application, so it needs to be accessible from common browsers; not just desktop browsers, but tablets and phones too.
        • Usability Compliance
          • Accessibility requirements could be legally required.
    • Configurable
      • Configuration for: system (dependencies), product (features), user (permissions)
        • Properties for the environment - networking, logging, dependencies
        • Properties for the domain - domain rules, constraints
        • Properties for the product - feature flags
      • Permissions: 
        • Permissions can look the same as feature flags, but they actually depend on them. Permissions depend on a user's role in the system; features depend on the product. Enabling a feature is decided before determining whether a particular user has access.
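    The ordering described above (product-level feature flag first, role-level permission second) can be sketched in a few lines. The flag names and roles here are hypothetical:

```python
# Feature flags are product-level; permissions are role-level.
FEATURE_FLAGS = {"advanced_search": True, "bulk_export": False}
ROLE_PERMISSIONS = {
    "admin":  {"advanced_search", "bulk_export"},
    "viewer": {"advanced_search"},
}

def can_use(feature, role):
    """A user can use a feature only if the product enables it
    AND the user's role grants access -- in that order."""
    if not FEATURE_FLAGS.get(feature, False):   # product decision first
        return False
    return feature in ROLE_PERMISSIONS.get(role, set())

print(can_use("advanced_search", "viewer"))  # True: flag on, role allows it
print(can_use("bulk_export", "admin"))       # False: flag off for the product
```

    Keeping the two checks separate means a feature can be turned off for the whole product without touching anyone's permissions.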
    • Performance
      • This is a measure of how responsive a system is. Is it fast? And what does that mean? Do things take a long time to load, or do some searches take a long time to run? When making a system that does a lot of things, it's an equal or larger challenge to make the system do those things in a timely manner.
      • Instrument the system; know how long things take. Ask about it during usability reviews.
      • Does it use an acceptable amount of memory, and does the memory use scale linearly with more usage?
      • Set targets to start:
        • Page loads within a 3-second response
        • Data ingestion speed
          • High-latency sources: 3 days
          • Medium-latency sources: 4 hours
          • Low-latency sources: 10 seconds
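    "Instrument the system; know how long things take" can start as small as a timing decorator that logs every call against its target. The function name and thresholds below are assumptions for illustration:

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("perf")

def timed(target_seconds):
    """Log how long a call takes and warn when it misses its target."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            try:
                return func(*args, **kwargs)
            finally:
                elapsed = time.monotonic() - start
                if elapsed > target_seconds:
                    log.warning("%s took %.3fs (target %.1fs)",
                                func.__name__, elapsed, target_seconds)
                else:
                    log.info("%s took %.3fs", func.__name__, elapsed)
        return wrapper
    return decorator

@timed(target_seconds=3.0)
def load_page():
    time.sleep(0.05)  # stand-in for real page-rendering work
    return "ok"

load_page()
```

    Once the timings are in the logs, the "set targets to start" numbers become something you can actually check against, rather than a feeling.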
    • Security 
      • If a user is going to hand over their data, whether it's their personal details or business information, they don't want to have to worry about the safety of that data. They don't want their account hacked, or issues where their confidence in the system's integrity comes into question. It comes down to trust.
      • Define a security policy, implement threat mitigation, and audit the system regularly.
      • Review the security aspects of new features, review operational security issues, and review the results of security testing for authentication (identity) and authorization (roles and permissions).
      • No unauthenticated access to client-specific data.
      • Questions:
        • Is there private user data in the system?
        • How can the different roles access the system?
        • Is it deployed in a secure environment? (Assess the network security and the use of standardized technologies)
        • Is the data persisted to a secure environment?
        • Is there any secure data in logs or other outputs?
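    The last question is one you can check mechanically. A minimal sketch of a scrubber that redacts sensitive fields before a message reaches the logs; the field names are an illustrative assumption:

```python
import re

# Fields we never want to appear in logs (illustrative list).
SENSITIVE = ("password", "token", "ssn")

def scrub(message):
    """Redact key=value pairs for sensitive fields before logging."""
    for field in SENSITIVE:
        message = re.sub(rf"({field}=)\S+", r"\1[REDACTED]", message)
    return message

print(scrub("login ok user=ana password=hunter2 token=abc123"))
# → login ok user=ana password=[REDACTED] token=[REDACTED]
```

    A filter like this belongs at the logging boundary, so no individual call site has to remember the rule.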
    • Maintainable 
      • To be able to work on the system with others, the code and artifacts need to be versioned, and the operator needs to be able to quickly diagnose issues. The code base needs to be maintainable so features can be added without breaking existing functionality.
      • Source code should be written in a manner that is consistent, readable, simple in design, and easy to debug. A set of attributes that bear on the effort needed to make specified modifications.
        • Can a user other than the developer run the system? Are the tools to do that usable and coherent?
        • Source code should be written to facilitate testability.
        • Does the implementation follow the intention of the design?
          • Is there a design?
          • The design of reusable components is encouraged. Component reuse can eliminate redundant development and test activities (i.e. reduce costs).
        • Does the code comply with a code standard?
        • What amount of the system can be verified with tests?
          • How many of those tests pass?
    • Extensible
      • Requirements are expected to evolve over the life of a product. Thus, a system should be developed in an extensible manner (i.e. perturbations in requirements may be managed through local extensions rather than wholesale modifications).
      • Can you change the functionality for future use? Is there a common coding standard?
      • Can similar functionality be implemented with the same component?
    • Observability/Serviceability
      • While the system is running, are issues easy to diagnose? If something goes wrong, will someone know immediately? How is it monitored?
      • Can a user other than the developer run and configure the system?
        • Are the tools to do that usable and coherent?
      • Can you change the functionality without rebuilding the application, and can you add to it later?
      • Observability capabilities should include:
        • An on-demand 'all systems go' status query
        • Real-time alerts on subsystem failure
        • The ability to see error, warning, and info messages
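    The 'all systems go' query above is usually a health check that rolls up per-dependency checks into one answer. A minimal sketch, where the dependency names and the always-healthy checks are stand-in assumptions:

```python
# Roll up individual dependency checks into one 'all systems go' answer.
def check_database():
    return True   # stand-in: would ping the real database here

def check_search_index():
    return True   # stand-in: would ping the real search service here

HEALTH_CHECKS = {"database": check_database, "search_index": check_search_index}

def health():
    """On-demand status query: overall status plus per-subsystem detail,
    so a failing dependency is visible immediately."""
    detail = {name: check() for name, check in HEALTH_CHECKS.items()}
    return {"status": "ok" if all(detail.values()) else "degraded",
            "subsystems": detail}

print(health())
# → {'status': 'ok', 'subsystems': {'database': True, 'search_index': True}}
```

    Exposing this over an HTTP endpoint then gives both the on-demand query and a hook for the real-time alerting to poll.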
    • Availability/Reliability 
      • The system needs to be able to run for long periods of time without degradation. Memory usage and resource allocations need to be sustainable, and system loads predictable.
      • Reliability Compliance
        • Is there a declared/contracted availability?
        • Does it run without degrading over time?
      • Deployment Execution
        • How do you know when a deployment is done?
        • Are the steps to deploy clearly explained and documented?
    • Portability 
      • The system needs to adhere to standards so that it can run on publicly available 'clouds' without modification. There are many candidates, and an agile project cannot be locked into a single vendor relationship.
      • Source code should be portable (i.e. not compiler or linker dependent). A set of attributes that bear on the ability of software to be transferred from one environment to another.
        • Continuous integration and automated deployment help here.
        • Will the application run on other environments? Other operating systems and networks? Have a developer version and a testing version with recent data. 
        • Does the application run on the target environment, or does it only work on the development environment? Is there a process for incident and change management? (ITIL standards are a good start here)
    • Interoperability 
      • Does it play well with other systems?
        • using standard protocols?
      • Can the system survive with its dependencies in a bad state?
      • Are all the dependencies identified? (Assess network connections and other dependency couplings)
    • Efficiency/Scalability 
      • The relationship between the level of performance of the software and the amount of resources used, under stated conditions.
      • Scaling: does it use an acceptable amount of memory, and does the memory use scale linearly with more usage?
      • Can the application share resources, or does it need its own machine/cluster?
      • Time behavior, resource utilization, resource allocation
      • Under what conditions does the application leak memory or hog CPU cycles?
      • Efficiency Compliance
        • What is the current performance benchmark? What's the next target?

    Application Architecture in the Large

    The system satisfies the requirements through its various features. Use the quality model for a specific feature, or for a collection of features within a product.
    Understand technical debt from a high level. It's quality debt, but like financial debt it can be useful to get you to the goal; it just needs to be managed effectively.

    Integrating into the process
    All of these points encompass quality standards that need to be validated by quality engineering.

    Sign-off
    For each quality requirement listed below, get sign-off from the owner as part of the architecture review. An understanding of the current benchmark and the achievable targets will properly frame expectations.
    • Functionality - Product Manager using backlog stories/acceptance criteria
    • Security - Sec Audit using security guidelines/standards
    • Usability - UI/UX - using usability standards
    • Performance - Performance Engineering using reliability/availability standards
    • Maintainability - Tech Lead using Code Guidelines/Standards
    • Portability - Operations using deploy and configuration Standards
    • Manageability - Operations through monitoring standards
    • Planning - Program Manager - using ADM/Scrum standards
    • Serviceability - Support using support standards

    Wrap it up

    You need some level of design to be successful for a system of any size, but too much architecture can slow down, or even stop, the implementation. Use consistent models and views that take into account the entire system, not just the UI and some functional stories.

    In the end, the design will happen anyway. If it isn't formalized somehow (at least in some words and a diagram), then it lives in the heads of whoever wrote the code. I have built some fairly large systems this way, but paid for it in the complexity of conveying that design to more people than myself; eventually the complexity of that mental model overwhelmed me. By using a simple template it becomes easy to account for what is going into the system, and much easier to share as a result.

    Start with a lean definition of this for your needs; you don't have to re-invent the wheel here, just use the template.