Despite having “mountains” of data, the truth remains that often we can’t directly measure what we really want to.
In fact, smart people can often put several balanced and convincing arguments to us about why getting that measured data is impossible.
It can be tempting to accept that, but it nags at you, especially when you want to improve.
But is there another way?
Here is an (apparently true) story that will hopefully help inspire your “can-do” thinking if you are stuck.
I found it interesting and hopefully you will too.
The Challenge Of Armouring A Fighter Plane
So the story goes that how to armour this plane was a real question the Navy was considering during WWII.
The challenge was knowing how to better protect the planes so they didn’t get shot down.
The issue was knowing precisely what happened. That was because, even when they could get to the planes, assessing the damage was hard. The planes that got shot down were so badly damaged that any analysis of the wreckage was futile.
So what to do?
Obvious Solutions No Good
One solution would be the belt-and-braces approach: simply armour the whole plane.
The issue with this is that it is not only expensive but also likely to be very heavy. Like racing cars, planes need to be lightweight, so this was not a viable option.
What they really wanted was to place the armour just where it was required and no more. But where?
For these guys the issue was more than a nagging problem; it was literally a matter of life and death.
I’ve been there like you may have been. Not in the military but stuck like that.
Stuck trying to get my head round a seemingly impossible measurement challenge.
It is the situation where you have worked out what you would really like to know but there seems to be no easy way of measuring it – understeer versus oversteer balance, say.
It can be immensely frustrating but next time you are in that “impossible” situation, don’t give up.
The way the Navy solved their problem is the same way you can: by using what I call the “inferred metric”.
Here is a checklist of 7 types (with examples) for you to work through next time you are stuck trying to measure the impossible.
The Inferred Metric
The “inferred metric” is an approach to get data on something you can’t measure directly (like where those planes were getting shot down) by taking what you can measure and inferring what you want to know from that.
There are many different types of inferred metrics.
Here are 7 (with examples) for you to consider, including the one the Navy used to work out where to armour their plane:
7 Types Of Inferred Metric To Consider In Your Problem Solving
1. Proportional change
This can be something moving up or down in proportion to what we really want to know.
In athletic sport, we might want to know our blood lactate levels. Measuring lactate requires a blood sample. Sadly this can be impossible for many reasons (e.g. lack of equipment at the venue, moving too much to take a sample, etc.).
What we can measure quite easily is heart rate. Whilst not an ideal match, there is a strong relationship between heart rate and lactate levels. Therefore, by measuring our heart rate we can infer what our lactate levels might be.
2. Experience of others
By seeing what others have done in a similar situation, we can sometimes infer what someone else would do too.
A good example would be online services like Spotify, Netflix or Amazon, where they watch what other people do (people they think are similar to us) and then suggest something that we might also like to try/do/buy.
That’s at a mass scale. It can also be relevant at a smaller scale.
An example could be in motorsports. If you’re trying to set up the racing car to have balance (balanced cars tend to go faster) then you can send out two or three different drivers. If they all come back with the same comments then you can infer what you need to do to make the car more balanced.
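As a rough sketch of that small-scale version, here is one way you might check whether the drivers agree. The function name `consensus`, the `min_share` threshold and the driver comments are all made up for illustration; real feedback is rarely this tidy.

```python
from collections import Counter


def consensus(comments, min_share=1.0):
    """Return the most common comment if enough drivers agree, else None."""
    if not comments:
        return None
    comment, count = Counter(comments).most_common(1)[0]
    return comment if count / len(comments) >= min_share else None


# Hypothetical feedback from three drivers on the same set-up.
print(consensus(["understeer", "understeer", "understeer"]))
print(consensus(["understeer", "oversteer", "understeer"]))
```

With the default `min_share` of 1.0 you only get an answer when every driver says the same thing. Lowering it (say to 0.6) accepts a majority view instead, which is a weaker inference.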
3. Controlled Experiment
Measuring what you want directly but not in a representative environment.
For example, going back to our lactate versus heart rate. If you run on a treadmill in a laboratory, we can control the exact speed you run at and the duration. You are also right there (and not 3 miles from the testing equipment).
We can get you to run at different speeds, taking your lactate periodically, and measure your heart rate at the same time.
A treadmill is not representative of your real running environment but it is close and enables us to get direct measurements.
When you’re back out running on the road you can then infer your lactate from your heart rate, from this lab data.
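Here is a minimal sketch of that lab-to-road step. It assumes a handful of invented treadmill readings (the numbers are illustrative only, not real physiological data) and simple straight-line interpolation between them. The name `infer_lactate` is hypothetical.

```python
from bisect import bisect_left

# Hypothetical lab calibration: (heart rate in bpm, lactate in mmol/L).
# These values are invented for illustration only.
LAB_POINTS = [(120, 1.0), (140, 1.5), (160, 2.5), (175, 4.0), (185, 7.0)]


def infer_lactate(heart_rate, points=LAB_POINTS):
    """Estimate lactate by linear interpolation between lab measurements."""
    rates = [hr for hr, _ in points]
    # Outside the range we measured in the lab, refuse to guess.
    if heart_rate < rates[0] or heart_rate > rates[-1]:
        raise ValueError("heart rate outside the range measured in the lab")
    i = bisect_left(rates, heart_rate)
    if rates[i] == heart_rate:
        return points[i][1]
    (hr0, la0), (hr1, la1) = points[i - 1], points[i]
    fraction = (heart_rate - hr0) / (hr1 - hr0)
    return la0 + fraction * (la1 - la0)


print(infer_lactate(150))
```

Refusing to answer outside the lab range is a deliberate choice here: the inference is only as good as the data it was calibrated on.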
4. Simulation
A simulation is like a special type of controlled experiment. Simulations can take many forms but the main two are statistical and physical.
The machine learning and AI “magic” of the “data science” community is typically statistical simulation.
They use (really) clever maths to determine something of value (e.g. a preference or best decision) from actual or derived relationships between things.
My background is in physical simulation (vehicle dynamics), where we are using our understanding of physics and materials to recreate something on a computer we can play with as if it were in real life.
With both types, what you do is “poke and prod” the simulation model in a realistic way, to see what would happen in real life. For example, try the effect of different spring or damper rates on lap time.
You then infer from that what you think would happen in real life.
Simulations are always inaccurate but can still be useful.
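To make the “poke and prod” idea concrete, here is a toy physical simulation. It is far simpler than a lap time model: a single mass on a spring and damper hits a step bump, and we sweep the damper rate to see how much the mass overshoots. The mass, spring rate, bump height and damper rates are all assumed values picked for illustration, and `peak_overshoot` is a hypothetical name.

```python
def peak_overshoot(damper_rate, mass=250.0, spring_rate=20000.0,
                   bump_height=0.05, dt=0.001, duration=3.0):
    """Simulate a mass on a spring and damper hitting a step bump.

    Returns how far (in metres) the mass overshoots the new road height.
    """
    position, velocity = 0.0, 0.0
    peak = 0.0
    for _ in range(int(duration / dt)):
        spring_force = -spring_rate * (position - bump_height)
        damper_force = -damper_rate * velocity
        acceleration = (spring_force + damper_force) / mass
        # Semi-implicit Euler: update velocity first, then position.
        velocity += acceleration * dt
        position += velocity * dt
        peak = max(peak, position)
    return peak - bump_height


# Poke and prod: try a few damper rates (N·s/m) and compare the results.
for rate in (500.0, 1500.0, 3000.0):
    print(rate, round(peak_overshoot(rate), 4))
```

In this toy model a stiffer damper gives less overshoot. Whether that carries over to the real car is exactly the inference you then have to make.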
5. Signal in the noise
Where you extract an underlying trend to infer something more valuable.
The obvious example here is acceleration traces in motorsports. They go up and down every fraction of a second. What you may want to know is how they are changing (or not) more generally over time.
By applying a signal filter to your traces you can infer what they are doing and make your data analysis more reliable.
There is a whole subject field of “signal processing” and “frequency analysis”, which you may know from the world of music and sound engineering. It is relevant to motorsports too and can help a great deal in data analysis, especially with acceleration data.
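As a sketch of the simplest possible signal filter, here is a centred moving average. It is only one of many filters you could choose, and the trace values below are made up for illustration.

```python
def moving_average(samples, window=5):
    """Smooth a trace with a centred moving average (use an odd window).

    Near the ends the window shrinks so the output has the same length.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        start = max(0, i - half)
        end = min(len(samples), i + half + 1)
        chunk = samples[start:end]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical noisy lateral-acceleration trace in g (made-up numbers).
trace = [0.9, 1.3, 0.8, 1.4, 1.0, 1.5, 1.1, 1.6, 1.2, 1.7]
print(moving_average(trace, window=3))
```

A wider window gives a smoother trend but blurs short events, so the window size is itself an assumption worth stating.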
6. Environmental context
Where there is a change in your environmental conditions, you can infer how that will affect what you’re interested in.
Context is a strangely complicated thing to define – as it is linked at a granular level to your specific circumstances.
However, typical environmental context could include:
- The weather (e.g. “cold”, “hot”, sunny)
- Seasonality (e.g. Christmas, Mondays, mornings)
- Health (e.g. illness, pregnancy, fitness)
- Events (e.g. local events, national, international)
- Situation (e.g. indoors/outdoors, home/office, travelling, holiday…)
The thought process is: “Given this context, what can we infer?”
In racing this is largely weather-related, such as changes in track temperature, but may also be track-specific, like how hilly it is or whether there has been recent track resurfacing.
Finding the right context is often really challenging.
Data people talk about the importance of “domain knowledge” and this is why.
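One rough way to put “given this context, what can we infer?” into practice is to label each measurement with its context and compare groups. The sketch below does that for lap times and track temperature. The lap data and the band thresholds are arbitrary assumptions, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical laps: (track temperature in deg C, lap time in seconds).
LAPS = [(18, 92.4), (19, 92.1), (27, 91.2), (29, 91.0), (41, 92.8), (43, 93.1)]


def temperature_band(track_temp):
    """Label a track temperature; the thresholds are arbitrary assumptions."""
    if track_temp < 22:
        return "cold"
    if track_temp < 35:
        return "mild"
    return "hot"


def mean_lap_time_by_band(laps):
    """Average lap time for each temperature band."""
    grouped = defaultdict(list)
    for track_temp, lap_time in laps:
        grouped[temperature_band(track_temp)].append(lap_time)
    return {band: mean(times) for band, times in grouped.items()}


print(mean_lap_time_by_band(LAPS))
```

Choosing the bands is where the domain knowledge comes in: the code is trivial, the thresholds are not.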
7. What is missing (The Navy’s Approach)
Finally, this is what the Navy did to solve their issue.
This inferred metric is where data that could be there isn’t.
You can then use your knowledge to infer what that means.
Have a look at this picture of that plane the Navy were looking to armour.
Each dot is a bullet hole recorded where a plane was shot.
Simple. Armour the planes where they are getting shot and you are done, right?
This data wasn’t from the planes that got shot down, it was from the planes that made it back.
That is critical context.
If you haven’t worked it out already have another look at that picture.
Where are you going to armour your planes?
Clearly you’re going to put the armour where the bullet holes ARE NOT.
You are inferring that from the fact that the planes that made it back could survive bullet holes in all these areas.
In fact, you have also inferred that where the planes actually needed the armour is where there aren’t any bullet holes.
You have concluded that you’d actually recommend putting armour where there is no evidence of bullet holes!
Just think about how that would sound…
But once you have that data and context the explanation is straightforward.
I’ll be honest, I did not see the bullet hole connection at first. As soon as it was pointed out I cannot NOT see it!
Such is the way with these things.
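The same reasoning can be sketched in a few lines: list the zones where hits could have been recorded, count the hits on the planes that came back, and look for the gaps. The zone names and hit records below are illustrative assumptions, not the Navy’s actual data.

```python
from collections import Counter

# Zones of the aircraft we can record hits against (illustrative names).
ZONES = ["engine", "cockpit", "mid wing", "rear fuselage",
         "wing tip", "central fuselage", "tail plane"]

# Hypothetical hit records from planes that made it back.
RETURNED_HITS = ["wing tip", "central fuselage", "tail plane", "wing tip",
                 "central fuselage", "wing tip", "tail plane"]


def zones_missing_hits(zones, hits):
    """Return zones with no recorded hits among the returning planes."""
    counts = Counter(hits)
    return [zone for zone in zones if counts[zone] == 0]


print(zones_missing_hits(ZONES, RETURNED_HITS))
```

Note that the code only finds the gaps. Reading them as “planes hit here did not come back” is the inference, and it relies on knowing that the data came only from the survivors.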
A Sanity Checklist
Inferred metrics are great but, because they are not a direct measure, you really should do one final step to put your (and your boss’s!) mind at rest.
Does this make sense? It is worth just doing a quick check when you have eureka moments.
Look again at the plane picture.
1- Do you have enough data?
You could always do with more data, but in this case, the coverage is consistent enough across the whole plane to draw at least broad conclusions.
Yes, you have enough data. Tick 🙂
2- What is in the areas which have no bullet holes? Are these important for flight?
First the two easy ones:
- Engines? Tick 🙂
- Cockpit? Tick 🙂
Now two more that require (some minimal) domain knowledge:
- Rear fuselage? – this holds the tail on. Tail is required. Tick 🙂
- Mid wings? – this area holds the fuel tanks. Fuel is needed. It is also explosive. Tick 🙂 Tick 🙂
3- Does this pass the sanity check then?
When you can’t measure something directly it can feel impossible. Don’t give up.
Use this checklist to help you consider which type of inferred metrics might be able to help you out.
You’ll be making assumptions taking this route. Just be aware of that and be straight with the people you’re working with – let them know that this is imprecise but potentially valuable.
In my experience, imprecise is often better than nothing at all.
See where you can take it.