How to Quantify Effort to Deploy a Product Integration

In a previous post, we talked about using a nine-box priority matrix to decide which product integrations to build and in what order. In short, you classify integrations along two axes: impact to the business and effort to deploy the integration. We also shared how to quantify one of those axes: impact to the business.

This time around, we'll cover the other axis--quantifying the effort required to deploy a product integration. This will give you a basis of comparison for different integration opportunities and the work that will go into them.

What Information to Gather

This exercise is about triage. You are only trying to estimate effort well enough to compare different integration opportunities. This is intended to help you comparatively prioritize, not to forecast actual cost or delivery dates. (That will come later.)

You need to strike a balance between enough detail to be useful and so much detail that it takes too long to do this analysis. The following bits of information are about the right level of detail to do this effectively.

You may have reason to include more than this. Every company is different and there may be things unique about your business that justify changing or expanding these criteria. This is not scientific, but always double check yourself. Make sure you really need to factor in more than what's below. Are you just convincing yourself? Does more information (and more time to gather/discuss) move the needle enough to matter?

You also may not always have this information for every integration opportunity. If that's the case, don't spin your wheels trying to get it. That scarcity is relevant. Lack of critical information should generally be something that pushes estimated effort up. Unknown is generally more complex (and expensive) than known.

Intended Data Flows

You'll need an idea of what data flows (sometimes called integration flows) are to be included in the overall integration.

Data flows represent a direction (from endpoint A to endpoint B) and a main entity or object type. Usually they are best described directly this way, but you can name them whatever is useful as long as a group of people with different backgrounds understands what the flow represents.

Note: Some integration software products implement the idea of data flows or integration flows, but conceptually no software is required. Just think, "If I were building a workflow in some arbitrary software product, what would it look like?"

Some examples:

Orders from Shopify to NetSuite
Customers from Salesforce to Marketo
Logs from Google Cloud to Snowflake
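
If it helps to keep the list somewhere more structured than a whiteboard, here is a minimal sketch in Python of a data flow as a direction plus a main entity. The `DataFlow` class and its field names are hypothetical, made up for illustration; no particular integration product is assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFlow:
    """Hypothetical record for one data flow: an entity moving from A to B."""
    entity: str  # main entity or object type being moved
    source: str  # endpoint A
    target: str  # endpoint B

    def describe(self) -> str:
        # Produces the plain-language name used in the examples above.
        return f"{self.entity} from {self.source} to {self.target}"


flows = [
    DataFlow("Orders", "Shopify", "NetSuite"),
    DataFlow("Customers", "Salesforce", "Marketo"),
    DataFlow("Logs", "Google Cloud", "Snowflake"),
]

for flow in flows:
    print(flow.describe())
```

A spreadsheet with the same three columns works just as well. The point is only that each flow has one entity and one direction.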

Again, your goal isn't to architect the entire thing. What you actually design and deploy will change later. For now, you just need a general understanding of what you think is required to accommodate the business use cases associated with that potential integration. Data flows are to effort as use cases are to impact.

Having this list, even an imperfect one, is important for a couple of reasons: 1) you need to generally know what data flows are necessary so you can provide the next piece of information, complexity, and 2) as a general rule, more data flows = more scope = more effort. Knowing that one integration's use cases can be addressed with two data flows while another's need eight is important.

Data Flow Complexity

Once you have that list of data flows, you need a general understanding of how complex each data flow will be.

This one is tough to nail, especially if you have little to no experience with one of the endpoint systems or you don't have enough domain expertise to know better. Again, "imperfect but useful" is the name of the game, and unknowns push effort upward.

Some things that typically make a data flow more complex include:

Having to make a bunch of API calls to assemble and/or save all the data you want to move from A to B
Complicated data transformations like converting an object into something fundamentally different (converting apples to oranges versus system A's apples to system B's apples)
Calculations, especially when money is involved--division and multiplication especially create rounding problems
High data volumes--building a flow to support 10 transactions per day is far easier than one that supports 10,000, which is easier than one that supports 1 million
A bidirectional data flow, where data must flow there and back in one session
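
To see why calculations with money made the list above, here is a minimal sketch using Python's standard `decimal` module. The scenario (splitting a 100.00 total three ways) and the `split_evenly` helper are assumptions for illustration, not something taken from a real integration.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def split_evenly(total: Decimal, parts: int) -> list[Decimal]:
    """Naive split: round each share to the cent independently."""
    share = (total / parts).quantize(CENT, rounding=ROUND_HALF_UP)
    return [share] * parts


total = Decimal("100.00")
shares = split_evenly(total, 3)

# The rounded shares no longer add up to the original total.
drift = total - sum(shares)
print(shares)  # three shares of 33.33
print(drift)   # 0.01 left over
```

Someone has to decide where that leftover cent goes, and the two endpoint systems may not agree. That kind of decision is what pushes a flow's complexity up.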

It's unlikely that everyone has a full enough picture that they can individually make a complete assessment of complexity. That's why it will be important to decide on complexity, and ultimately overall effort, with a cross-functional group. (More on that later.)

Endpoint API Types

It's pretty easy to find out what type of API a given endpoint in the integration offers. This has an impact on overall effort to deploy the integration, because different kinds of endpoints require different kinds of work.

The following is a short list of commonly used API types:

REST (REpresentational State Transfer) -- This is extremely common!
SOAP (Simple Object Access Protocol)
GraphQL
RPC (Remote Procedure Call)

Simply knowing which of these is used on both sides of the data flows helps you decide on effort. The impact of the API type should be factored in based on 1) your team's familiarity with APIs of that type and 2) whether or not there are different types on opposite ends of the integration.

Endpoint "Auth" Requirements

Authentication and authorization (who are you and what are you allowed to do?) is usually one of the hardest parts of building an integration. These concepts are a little advanced for someone non-technical anyway, but then every endpoint handles them differently, even when they implement a supposed standard.

Knowing what types of auth are available for a given endpoint helps the team (primarily more technical members) gauge how hard it will be to auth with an API. Generally, older, non-standard, or heavy enterprise auth mechanisms are higher effort to integrate. However, this definitely depends on your team's specific skills and experience.

Documentation

The ability to deliver an integration is highly dependent on the availability of API documentation. Most APIs are not self-describing enough for someone to go at them without docs, but the availability and quantity of those docs vary from endpoint to endpoint.

At one end of the spectrum, well-written API documentation combined with docs generated via OpenAPI (a standard for building self-describing APIs) is less effort to work with than a poorly written, hand-keyed API doc--or worse, no API documentation at all!

If you can answer the following questions about the documentation, it should be enough to know how well the API is documented:

Does the API have documentation?
Where is the API documentation located? Is it publicly accessible?
Was it built using Swagger/OpenAPI or RAML (standards-based approaches), or was it hand-keyed?
How often is it inspected for accuracy and to account for updates?

End User Inputs

All of the above apply to any integration project. However, if you are building a product integration, you're taking on another level of complexity. Unlike a general integration project that gets implemented one time for one stakeholder, a product integration is used by many.

In other words, if you are endpoint system A building a product integration to endpoint system B, you need to build an integration that supports the needs of all of your customers who may want to integrate to system B--not just one customer who wants to do so in a very specific way.

This means your requirements are open ended to some extent, which has an impact on the project's required effort.

There are two primary drivers for the requirement that end customers provide input:

Data incompatibilities between the two endpoints that can only be resolved on a case by case basis (e.g. mapping associated tax codes across two systems)
Decision points that allow the integration to flex to more versions of the end customer's requirements

The first one is more directly understood, but it does require you to catch those incompatibilities in the integration design. Sometimes that happens later via trial and error, because if you aren't familiar with the API you're integrating to, it's hard to see these.

The second one is all about understanding a variety of your customers' needs. If you can build a product integration with input from three to five customers, instead of just one or none, you see variations in how they want to use the integration. This helps ensure you don't build something that doesn't work for the majority of customers who wanted that integration.

If it sounds a little esoteric, that's because it is. The key takeaway: more expected variability in a flow (requiring input from the end customer) generally means more effort.

Estimating Effort to Build an Integration

Now that you've collected a reasonable amount of information about the integration opportunity, you need to come up with a quantity for what the effort will be.

As with most estimation, this is as much art as it is science. The goal isn't to be perfect. The goal is to be informed and reasonable and to apply similar standards and practices across all of the integrations you estimate effort for.

Ways to Estimate Software Tasks/Projects

Before talking about the approach we advocate for estimating integration opportunities, it's worth understanding some of the common ways software teams estimate tasks and projects in general.

Any of these approaches are workable for estimating integrations. In case any one seems to suit you better than our recommended approach, we'll talk about all of their advantages and disadvantages.

Time-Based Estimation

The simplest way to estimate the effort required to build a piece of software, a feature, an integration, or really anything is to estimate how much time it will take to get it done. This is typically done in "person hours" or "person days" (how many hours or days for one person to complete a task).

This is really helpful for well-known, repeatable processes like how long it takes to assemble a part in a factory. It's also more accurate for small tasks. It's less helpful and less accurate when the output is not as well defined, like an art project or a piece of software that solves a problem in a new way.

You can get more accurate with time-based estimation, but it takes more effort to get there. Breaking down individual tasks and which roles handle them helps you estimate based on specializations and sequencing of tasks. Grossly oversimplifying things, this is what traditional Project Management Institute (PMI) estimation and Gantt charts do.

Advantages: It's easy to understand a time-based estimation. If the estimate says 40 hours, there is no question how many hours the estimate represents (that sounds silly now, but read on). It's also quite easy to compare two tasks with a time-based estimate assigned. It's not hard to understand that the task with 40 hours estimated is half the effort of the one with 80 hours.

Disadvantages: People are notoriously bad at estimating time, and they are exponentially worse as the thing they are estimating gets bigger. For example, imagine estimating the number of M&Ms in someone's hand versus the number filling a mason jar versus the number filling a dump truck. You'll statistically be much closer the smaller the overall population is. This applies to estimating time for small and large projects--the latter is very difficult for humans to do.

The other disadvantage is that coming up with time-based estimations that are more accurate takes even more time.

T-Shirt Size Estimation

Another straightforward way to estimate the size of a software task or project is using t-shirt sizes. This is far less precise, but it requires far less analysis or calculation. With this approach, it's really about gut feel, and it's really only useful for comparison of multiple tasks or projects. Consider using XS, S, M, L, XL, but you can expand with as many Xs as you feel you need.

Advantages: This is fast. This is easy. Everyone understands that the XL task is bigger than the M task.

Disadvantages: There is technically no quantity to these sizes. Is an S bigger than an XS by the same amount that an M is bigger than an S? Just how big is an XL? Since there is no quantitative measurement, this is really only best for quick and dirty comparative estimation.

Agile Estimation

The emergence of Agile software development over the past couple decades introduced a different estimation approach that blends the advantages of time-based and t-shirt size estimation. It involves estimating tasks or projects based on an arbitrary scale that represents an abstraction of "work to be put in". It's intentionally vague about what the measurement means other than bigger number = more. However, the scale generally increases exponentially.

Most Agile estimation is done by assigning a point value to a task using the Fibonacci sequence: 1, 2, 3, 5, 8, 13, 21, and so on. A "1" should represent the lowest possible effort, relative to all the tasks you are comparing and estimating. These are typically smaller bites that are most completely understood. A "2" is roughly twice the effort of a "1", and so on.

You land on the effort estimate for a given task or project by your team coming up with a reasonable, albeit imperfect consensus.

If everyone lands on a "2", then good! It's a "2".

If you have three people saying "5" and one saying "8", then go with "5".

But if half your team says a "2" and half says a "13", you've got enough discrepancy to warrant some discussion (bonus: you probably just uncovered some miscommunication).
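
The three cases above can be sketched as a small Python function. This is one possible reading, not a standard algorithm: the `consensus` name, the `max_gap` threshold of one step on the scale, and the choice to break ties toward the higher number are all assumptions made for illustration.

```python
from collections import Counter
from typing import Optional

# Point scale from the text; extend it if your team uses larger values.
SCALE = [1, 2, 3, 5, 8, 13, 21]


def consensus(votes: list[int], max_gap: int = 1) -> Optional[int]:
    """Return an agreed point value, or None if the team should talk it over.

    Assumption: votes more than `max_gap` steps apart on the scale signal
    a real disagreement rather than noise. Votes not on SCALE raise ValueError.
    """
    positions = [SCALE.index(v) for v in votes]
    if max(positions) - min(positions) > max_gap:
        return None  # enough discrepancy to warrant discussion

    counts = Counter(votes)
    top = max(counts.values())
    # Go with the most common vote; on a tie, assume "unknown" and go higher.
    return max(v for v, c in counts.items() if c == top)


print(consensus([2, 2, 2, 2]))    # everyone agrees
print(consensus([5, 5, 5, 8]))    # majority wins
print(consensus([2, 2, 13, 13]))  # None: discuss first
```

In practice the conversation matters more than the function. The `None` case is where you tend to uncover the miscommunication.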

Advantages: Agile estimation is intentionally imperfect and intentionally vague, but it does provide a quantitative measurement that a fairly constant team can consistently apply.

It also respects the exponential aspect of estimation--that we are far less equipped to discern a "20" from a "21" than we are a "3" from a "2".

Furthermore, this is increasingly considered a best practice for estimating software projects where the output is unpredictable and often discovered over time.

Disadvantages: Explaining what a "point" means to someone who is new to the method can be a tall task. Often the novice will immediately ask, "How many hours is a point?" at which point this method starts to lose some value.

This approach also requires some practice to get your head around doing it well. For most teams, learning to embrace the imperfection of it is hard.

Estimating Integration Projects

Now, let's get to the important part of all of this: how to use the information you gathered about each integration opportunity to assign a quantified effort estimate. In doing so, you can qualify them as high, medium, or low effort and can thus place them in the appropriate box in the nine-box framework.

For each integration opportunity, follow these steps:

Step 1: Collect Your Information

Take no more than half a day's worth of work to collect the information described earlier in this post as well as anything additional your team believes is important. Depending on your team's structure and who has what skills, you may need more than one person, each with different skill sets, to collect all the information.

Step 2: Assemble the Team

Product integrations are the space between Product Management, Software Engineering, Partnerships/Business Development, and Implementation/Support. To the extent possible, assemble a cross-functional team of 3 to 7 team members who can build consensus estimates.

Remember, if two people have very different perspectives on an effort estimate, especially if they have different areas of expertise, there's a valuable discussion to be had.

Step 3: Review the Information

Review the information brought to the table. Take some time to ask/answer questions, but leave the big, unanswerable ones unanswered. This isn't a design session. If there are unknowns, people should factor those into their effort estimation, because more unknowns are likely to result in more effort expended when all is said and done.

Step 4: Consensus Estimate Per Data Flow

Use Agile estimation to size the effort for each individual data flow in the integration. The team may not agree 100% on the number, so if you can meet in the middle, go for it. If you have large differences of opinion, discuss them and try to land on a consensus. When debating two numbers, assume "unknown" and go with the higher one.

Step 5: Add Up the Data Flow Estimates

Then to size the entire integration project, add up the estimates for each data flow. This gives you a quantitative effort score for the whole project.

Adding up estimates for individual data flows is important, because it allows your team to compare, for example, an integration with two very complex data flows with one that has nine simple ones.
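
As a rough sketch of Step 5, the Python below sums per-flow points and drops the total into a low, medium, or high bucket for the nine-box matrix. The flow names, point values, and bucket thresholds (10 and 20) are hypothetical placeholders; your team would calibrate the thresholds against the range of totals it actually sees.

```python
def integration_effort(flow_points: dict[str, int]) -> int:
    """Total effort score: the sum of the consensus points per data flow."""
    return sum(flow_points.values())


def effort_bucket(total: int, low_max: int = 10, medium_max: int = 20) -> str:
    """Map a total score to a nine-box effort level (thresholds are assumed)."""
    if total <= low_max:
        return "low"
    if total <= medium_max:
        return "medium"
    return "high"


# Hypothetical integration with two very complex data flows.
two_complex = {"Orders from A to B": 13, "Invoices from B to A": 13}
# Hypothetical integration with nine simple data flows.
nine_simple = {f"Flow {i}": 2 for i in range(1, 10)}

for name, flows in [("two complex", two_complex), ("nine simple", nine_simple)]:
    total = integration_effort(flows)
    print(name, total, effort_bucket(total))
```

With these made-up numbers, the two-flow integration scores 26 and the nine-flow integration scores 18. The flow count alone would have suggested the opposite ranking.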

Repeat

Ideally you spend no more than 30 minutes estimating a single integration project. If you get good at this, you can probably get it down to less than 10.

This may seem like flying through it, but remember, you are only trying to get a reasonable guess for comparison purposes. The team delivering the integration will get more specific when that integration comes to the top of the queue.

It's helpful to do this in sessions where many integration projects are estimated at once. These sessions are sometimes called Planning Poker, named after a card-based "game" for getting to point-based consensus quickly. If you've done your homework, you may be able to estimate half a dozen to a dozen or more projects in a single session.

Why This Is Important

The more integration projects you are able to estimate, and the more effectively you can estimate them, the better your ability to build the right integrations in the right order.

Understanding cross-functionally how big a project is helps you avoid working on an integration that never gets finished (so, who cares how impactful it can be?). It also helps you identify ones that can deliver value fast, because they require little effort.

Having a consistent methodology for estimating effort is the only way to make those comparisons. Otherwise you are flying blind.

Software engineers, especially those who have worked on Agile teams before, are likely to be pretty comfortable with this process. But, integration is not just a software engineer's challenge, and Agile is not only for software development.

Learn to think in terms of Agile estimation, managed imperfection, and comparison-based priority across all of the functions that have a stake in your integration strategy.

It gets pretty easy to do pretty quickly, once you get a few reps.