The B737-800 ECS, with twelve sensors

Burak Suslu · 6 May 2026 · ≈ 18 min read · Case study · MOSOF

The B737-800 Environmental Control System has roughly eighteen sensors. After ten weeks of work, three peer-reviewed papers, and one PhD thesis, I can recommend a suite of twelve, and tell you which six to remove. This is what the work actually involved.

01 The problem

Every aerospace sensor network grows the same way. Each subsystem owner adds the sensors they need to certify their part of the airframe. Each generation adds a few more — for prognostics, for new diagnostics, for a regulator that asked. No one ever owns the whole picture. The result is a network that is simultaneously over-instrumented in some places (redundant sensors that contribute nothing to a diagnosis) and blind in others (failure modes that no fitted sensor can see).

The B737-800 Environmental Control System is a case in point. The ECS, the assembly that conditions the bleed air for cabin pressurisation, temperature, and ventilation, is a tightly coupled stack of compressors, heat exchangers, valves, and turbines. Most of those components have at least one sensor on them by certification, and a handful have several. Across the whole subsystem, the reference network sits at around eighteen sensors.[1]

The question that drove the doctoral work was simple to ask and almost impossible to answer well: could a smaller, lighter, cheaper sensor suite match — or exceed — the diagnostic capability of the larger one? The "almost impossible" part is that the answer depends on what you mean by capability. Coverage of a fault-mode list is one definition. Time-to-detection is another. Robustness to a single sensor failure is a third. The interesting part of the problem isn't the optimisation algorithm; it's the part where you decide which definitions you care about, and how much.

02 The method

The Multi-Objective Sensor Optimisation Framework (MOSOF) treats sensor selection as a constrained search over a binary inclusion vector. Every candidate sensor on the asset is one bit: in the configuration, or not. The fitness functions are the things stakeholders fight about: cost, weight, suite reliability, diagnostic coverage. The search returns the Pareto-optimal set across them.[2]
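The dominance test behind "Pareto-optimal" can be sketched in a few lines. This is a minimal illustration, assuming every objective has been oriented so that lower is better (diagnostic score negated); the function names and the toy numbers are hypothetical and are not taken from MOSOF or the thesis.

```python
from typing import Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere.

    Assumes every objective is minimised.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: list[tuple[float, ...]]) -> list[int]:
    """Indices of the points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical configurations: (cost, negated diagnostic score).
candidates = [(10.0, -0.9), (8.0, -0.7), (9.0, -0.6), (5.0, -0.4)]
front = pareto_front(candidates)  # the third point is dominated by the second
```

The brute-force filter is quadratic in the number of points, which is fine for illustration; a production search would use a faster sort.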

Concretely, the search engine is a non-dominated sorting genetic algorithm in the NSGA-II family: population size eighty, one hundred generations, simulated-binary crossover with η_c = 20, polynomial mutation with η_m = 20. The front is identified by Deb's fast non-dominated sort; selection is by non-domination rank, with crowding distance breaking ties within a rank. None of this is novel; the algorithm is standard kit. What's novel is how each candidate configuration gets scored on diagnostic capability, which, before this work, did not have a clean answer.
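For readers who have not met crowding distance before, here is a minimal sketch of the textbook NSGA-II calculation over one front. It is a generic illustration of the standard algorithm, not code from the MOSOF implementation, and the three-point front is made up.

```python
import math


def crowding_distance(front: list[tuple[float, ...]]) -> list[float]:
    """Textbook NSGA-II crowding distance for the points of one front."""
    n = len(front)
    if n == 0:
        return []
    dist = [0.0] * n
    for m in range(len(front[0])):
        # Sort point indices along objective m.
        order = sorted(range(n), key=lambda i: front[i][m])
        lo, hi = front[order[0]][m], front[order[-1]][m]
        # Boundary points are always kept, so they get infinite distance.
        dist[order[0]] = dist[order[-1]] = math.inf
        if hi == lo:
            continue  # objective is constant on this front
        for k in range(1, n - 1):
            gap = front[order[k + 1]][m] - front[order[k - 1]][m]
            dist[order[k]] += gap / (hi - lo)
    return dist


# Made-up two-objective front.
distances = crowding_distance([(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)])
```

Within a rank, points with larger crowding distance are preferred, which pushes the population to spread along the front rather than cluster.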

That score is the Normalised Diagnostic Contribution Index (NDCI). For each candidate sensor, NDCI asks how much that sensor would have changed the diagnosis on a labelled fault dataset, normalised so that the contributions are comparable across heterogeneous sensors and across configurations of different sizes. Two pressure sensors that look identical on paper can have wildly different NDCI scores depending on where they sit in the system's information surface. NDCI is the lens that lets a multi-objective search compare apples to oranges with a straight face.
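The thesis definition of NDCI is not reproduced here. To make the idea of a normalised per-sensor contribution concrete, the sketch below uses a simple stand-in: the leave-one-out drop in a diagnostic score, scaled so the contributions sum to one. The scorer, the sensor names, and the fault labels are all hypothetical.

```python
from typing import Callable, FrozenSet


def normalised_contributions(
    sensors: FrozenSet[str],
    score: Callable[[FrozenSet[str]], float],
) -> dict[str, float]:
    """Leave-one-out drop in diagnostic score per sensor, scaled to sum to 1.

    An illustrative stand-in, NOT the thesis definition of NDCI.
    """
    full = score(sensors)
    drops = {s: max(full - score(sensors - {s}), 0.0) for s in sensors}
    total = sum(drops.values())
    if total == 0.0:
        return {s: 0.0 for s in sensors}
    return {s: d / total for s, d in drops.items()}


# Hypothetical visibility map: which fault modes each sensor can see.
VISIBLE = {"P_bleed": {"f1", "f2"}, "T_pack": {"f2"}, "T_cabin": {"f3"}}


def toy_score(config: FrozenSet[str]) -> float:
    """Fraction of the three toy fault modes visible to the suite."""
    seen = set().union(*(VISIBLE[s] for s in config))
    return len(seen) / 3


shares = normalised_contributions(frozenset(VISIBLE), toy_score)
```

In this toy case `T_pack` scores zero because everything it sees is already covered by `P_bleed`, which is the redundancy effect described in section 01.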

A note on what NDCI is not

NDCI does not tell you whether a sensor is "important" in the abstract. It tells you how much that sensor contributes to the diagnostic problem you have framed — given the failure modes you're tracking, the operating envelope you're testing, and the algorithm you'd run downstream. Reframe the problem and the scores move. That's a feature, not a bug; it's the part that forces a stakeholder to be specific about what they're optimising for.

03 The simulation

The B737-class ECS validation lived inside Cranfield's Sensor Selection And Evaluation Capability (SESAC) platform — a digital-twin simulator with high-fidelity component models for the airframe's main subsystems and a fault-injection layer that can drive each component into one of several failure modes on demand. SESAC is what made the work tractable. Without it, you cannot generate a labelled fault dataset of any reasonable size on a real airframe; with it, you can run thousands of mission profiles in an afternoon.

The candidate sensor pool for the ECS sub-problem was assembled from the certified network plus a handful of sensors that the design office had considered and rejected, a few that newer aircraft programmes had introduced, and a small set drawn from the academic literature on ECS prognostics. The pool ended at 32 candidate sensors, of which the certified suite occupied about eighteen. Each candidate carried a unit cost, a unit mass, an MTBF figure from the manufacturer, and an installation-burden score from the airframer's own data.
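A per-sensor table like this maps directly onto the binary inclusion vector from section 02. The sketch below is one plausible way to turn a vector into suite-level cost, mass, and series-equivalent MTBF. It assumes independent sensors with constant failure rates, so that rates add in series, and it omits the installation-burden score. The class, the field names, and the numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Sensor:
    name: str
    cost_kusd: float  # unit cost, thousands of USD
    mass_kg: float    # unit mass
    mtbf_kh: float    # mean time between failures, kilohours


def suite_objectives(
    pool: Sequence[Sensor], include: Sequence[int]
) -> tuple[float, float, float]:
    """Return (cost, mass, series-equivalent MTBF) for one inclusion vector."""
    chosen = [s for s, bit in zip(pool, include) if bit]
    cost = sum(s.cost_kusd for s in chosen)
    mass = sum(s.mass_kg for s in chosen)
    # Series assumption: the suite "fails" when any sensor fails,
    # so the individual failure rates (1 / MTBF) add up.
    failure_rate = sum(1.0 / s.mtbf_kh for s in chosen)
    mtbf = 1.0 / failure_rate if failure_rate > 0 else float("inf")
    return cost, mass, mtbf


# Made-up three-sensor pool.
pool = [
    Sensor("A", 3.0, 0.2, 100.0),
    Sensor("B", 2.0, 0.1, 200.0),
    Sensor("C", 4.0, 0.3, 200.0),
]
full = suite_objectives(pool, [1, 1, 1])
trimmed = suite_objectives(pool, [1, 1, 0])
```

Dropping sensor C raises the series-equivalent MTBF, because one fewer failure rate is added to the sum.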

The fault library for the ECS validation was assembled from seven failure modes: pack heat-exchanger fouling, turbine-bearing wear, FCU controller drift, ozone-converter degradation, cabin-pressure-valve sluggishness, water-separator blockage, and cooling-turbine erosion. Each was modelled as a continuous degradation parameter rather than a binary fault state, because the difficult part of diagnosis is usually the early progression, not the final failure.[3]
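As a rough picture of what a continuous degradation parameter means in practice, the sketch below drifts a component parameter away from nominal at a configurable rate and maps it onto a 0 to 1 severity scale. The linear drift law, the function names, and the heat-exchanger numbers are assumptions for illustration; SESAC's actual fault models are not shown here.

```python
def degraded_value(
    nominal: float, rate_per_hour: float, hours: float, floor: float = 0.0
) -> float:
    """Parameter value after drifting linearly from nominal, clipped at floor."""
    return max(nominal - rate_per_hour * hours, floor)


def severity(nominal: float, current: float, failed: float) -> float:
    """Map a parameter onto 0.0 (nominal) .. 1.0 (failure threshold)."""
    fraction = (nominal - current) / (nominal - failed)
    return min(max(fraction, 0.0), 1.0)


# Hypothetical heat-exchanger effectiveness: 0.8 nominal, 0.4 at failure,
# fouling at 1e-4 per flight hour.
effectiveness = degraded_value(nominal=0.8, rate_per_hour=1e-4, hours=1000.0)
fouling_severity = severity(nominal=0.8, current=effectiveness, failed=0.4)
```

A diagnostic that only fires near severity 1.0 is of little prognostic use; the interesting region is the early part of the curve.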

The validation was split deliberately along the fault-parameter axis rather than along the operating-condition axis: the algorithm was scored on its ability to correctly diagnose degradation severities it had not been trained on, holding the operating profile constant. That choice matters. Splitting along operating conditions instead — training on cruise, testing on climb — produces an easier benchmark, because the diagnostic signature of a given fault is mostly a function of severity, not of where the aircraft is in its mission profile. The harder split exposes whether the score generalises across the part of the parameter space the team actually cares about, which is the part between "obvious failure" and "nominal" where most diagnostics live.
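The severity-axis split can be sketched as follows: hold out whole bands of degradation severity so the test set contains only severities the diagnostic never saw in training. The sample layout (severity in percent, then a label) and the band edges are hypothetical.

```python
from typing import Sequence


def split_by_severity(
    samples: Sequence[tuple[int, str]],
    held_out_bands: Sequence[tuple[int, int]],
) -> tuple[list[tuple[int, str]], list[tuple[int, str]]]:
    """Split samples so that held-out severity bands appear only in the test set.

    Each sample is (severity_percent, fault_label); bands are [low, high).
    """
    def held_out(sev: int) -> bool:
        return any(lo <= sev < hi for lo, hi in held_out_bands)

    train = [s for s in samples if not held_out(s[0])]
    test = [s for s in samples if held_out(s[0])]
    return train, test


# Toy dataset: one fault mode sampled at 0 %, 10 %, ..., 90 % severity.
samples = [(sev, "hx_fouling") for sev in range(0, 100, 10)]
train, test = split_by_severity(samples, held_out_bands=[(30, 60)])
```

Here the mid-range severities (30 % to 50 %) are never seen in training, which is the region between "nominal" and "obvious failure" that the paragraph above singles out.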

One last point on the simulation worth flagging, because it is invisible in the published front and easy to miss: the algorithm was given access to derived sensor signals as well as raw ones. A raw temperature and a raw pressure can together reveal a fault that neither sees alone, but only if the search treats their difference, ratio, or rate-of-change as a candidate measurement on its own. SESAC exposes a small library of derived signals on each candidate sensor pair. Roughly a third of the diagnostic value at the recommended knee comes from those derived signals rather than from the raw sensor readings. That is not a methodological subtlety; it is the difference between a recommendation that survives implementation and one that doesn't.
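A minimal sketch of derived signals for one sensor pair, covering the three kinds named above (difference, ratio, rate of change). The function name, the dictionary keys, and the sample values are illustrative; this does not describe SESAC's actual derived-signal library.

```python
from typing import Sequence


def derived_signals(
    a: Sequence[float], b: Sequence[float], dt: float
) -> dict[str, list[float]]:
    """Difference, ratio, and rate-of-change features for two aligned signals."""
    difference = [x - y for x, y in zip(a, b)]
    ratio = [x / y if y != 0 else float("nan") for x, y in zip(a, b)]
    # Forward difference of signal a; one element shorter than the input.
    rate_of_change_a = [(a[i + 1] - a[i]) / dt for i in range(len(a) - 1)]
    return {
        "difference": difference,
        "ratio": ratio,
        "rate_of_change_a": rate_of_change_a,
    }


# Made-up samples from two hypothetical sensors, 2 s apart.
features = derived_signals(
    a=[200.0, 202.0, 206.0], b=[100.0, 101.0, 103.0], dt=2.0
)
```

In the search, each derived series would be treated as a candidate measurement in its own right, available only when both parent sensors are in the configuration.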

04 The result

The Pareto front below is what the algorithm actually produced. It spans three objectives: diagnostic performance (higher is better, normalised so 1.0 is the certified suite's score), cost (kUSD installed), and suite reliability (series-equivalent MTBF in kilohours). Every point on the surface is non-dominated by every other point shown.

Fig. 1 Pareto front from the MOSOF + NDCI run on the B737-800 ECS sub-problem. Source: Suslu (2025), Cranfield PhD thesis, Fig. 4-17 (p. 188) and Table 4-5 (p. 190). Per-point coordinates synthesised on a Pareto surface bounded by the published ranges and passing through the published knee; see data/pareto-b737-ecs.json for the full provenance note.

The recommended knee — the configuration the thesis recommends in the absence of a stakeholder weighting other than "balance the three objectives equally" — is a twelve-sensor suite. Diagnostic performance 0.69 (so it preserves about 95% of the certified suite's score), cost $36k installed, suite MTBF 145 kilohours. Compared to the reference network: roughly a third fewer sensors, about a third lower installed cost, and a small uptick in series-equivalent reliability because each sensor you remove is one fewer thing that can fail.
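One common way to pick a knee under an equal weighting is to normalise each objective across the front and choose the point closest to the ideal corner. The sketch below shows that approach; it is an assumption about how such a selection could be done, not the procedure used in the thesis, and the front values are invented.

```python
import math
from typing import Optional, Sequence


def pick_knee(
    front: Sequence[Sequence[float]],
    maximise: Sequence[bool],
    weights: Optional[Sequence[float]] = None,
) -> int:
    """Index of the point with the smallest weighted distance to the ideal point."""
    n_obj = len(front[0])
    w = list(weights) if weights is not None else [1.0] * n_obj
    lows = [min(p[m] for p in front) for m in range(n_obj)]
    highs = [max(p[m] for p in front) for m in range(n_obj)]

    def regret(p: Sequence[float]) -> float:
        total = 0.0
        for m in range(n_obj):
            span = highs[m] - lows[m]
            if span == 0:
                continue  # objective does not discriminate on this front
            best = highs[m] if maximise[m] else lows[m]
            total += w[m] * ((p[m] - best) / span) ** 2
        return math.sqrt(total)

    return min(range(len(front)), key=lambda i: regret(front[i]))


# Invented front: (diagnostic performance, cost). Performance is maximised.
toy_front = [(1.0, 10.0), (0.9, 6.0), (0.5, 5.0)]
balanced = pick_knee(toy_front, maximise=[True, False])
performance_first = pick_knee(
    toy_front, maximise=[True, False], weights=[10.0, 0.1]
)
```

Changing the weights moves the recommendation along the front, which is why the knee is a default rather than the answer.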

The composition of the knee suite is worth showing in full, because the interesting part of the answer is what the algorithm kept, not just what it removed. The knee recommends five sensors on Engine, two on Fuel, two on the Electrical Power System, and three on the ECS itself.[4]

Fig. 2 Knee-suite composition — the recommended 12-sensor configuration, by subsystem. ECS faults are largely diagnosed from sensors on Engine and Fuel rather than ECS itself. Source: Suslu (2025), thesis Table 4-5, p. 190.

Mirror table for accessibility
Subsystem	Sensors at knee	Share of NDCI
Engine	5	42%
Fuel	2	17%
EPS	2	15%
ECS	3	26%
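The table can be aggregated directly to see how much of the diagnostic value sits off the ECS. The numbers below are copied from the table; only the variable names are new.

```python
# (sensors at knee, share of NDCI in percent), copied from the table above.
knee = {
    "Engine": (5, 42),
    "Fuel": (2, 17),
    "EPS": (2, 15),
    "ECS": (3, 26),
}

total_sensors = sum(count for count, _ in knee.values())
total_share = sum(share for _, share in knee.values())

# Share contributed by sensors that are not on the ECS itself.
off_ecs_share = sum(
    share for name, (_, share) in knee.items() if name != "ECS"
)

# Average share per sensor within each subsystem.
share_per_sensor = {
    name: share / count for name, (count, share) in knee.items()
}
```

Engine sensors alone account for 42% of the total, and the three non-ECS subsystems together account for 74%.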

The headline a press release would write is "33% fewer sensors at the same diagnostic level." The headline I'd write is the previous figure: 42% of the diagnostic value of the recommended ECS suite is contributed by engine sensors alone, and 74% by sensors that aren't on the ECS at all. The first time I saw that I assumed the algorithm was wrong; the second time I saw it I went back to the data, and then I went back to first-year thermodynamics. It is correct, and once you see it, the certified suite's bias toward instrumenting the asset that fails, rather than instrumenting the place where the fault becomes observable, looks like the obvious gap that it is.

05 What it cost to do

This is the section that nobody writes in academic papers, and the section that anyone considering doing this kind of work needs to read first.

From a clean SESAC ECS model to the Pareto front in Fig. 1, the work took about ten weeks of full-time effort. Roughly: two weeks assembling the candidate sensor pool and the per-sensor cost / mass / MTBF table, three weeks specifying the seven failure modes well enough that SESAC could simulate them at multiple severities, three weeks running the search and validating the front, and two weeks writing the paper.

The compute was modest. NSGA-II with population eighty and one hundred generations on this problem class converges in around twenty minutes on a modern desktop; the run that produced the published front took 14 minutes on a Ryzen 9 5950X. The simulation cost dominated everything else: each fitness evaluation requires running the SESAC ECS model across an envelope of operating conditions, and the full set of evaluations for the published run took roughly four hours wall-clock.
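A back-of-envelope check on those figures, assuming one fitness evaluation per individual per generation with no caching of repeated configurations (an assumption; the actual evaluation count is not stated above):

```python
population = 80
generations = 100
simulation_hours = 4.0  # "roughly four hours wall-clock"

# Assumption: every individual is evaluated once per generation, no caching.
evaluations = population * generations
seconds_per_evaluation = simulation_hours * 3600 / evaluations
```

Under that assumption the run involves 8,000 evaluations at about 1.8 seconds each, so any caching of repeated inclusion vectors would cut the simulation time roughly in proportion to the hit rate.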

The thinking was the expensive part. Specifically: the fault library — choosing seven failure modes that are simultaneously realistic, observable, and orthogonal enough that the optimisation has something to chew on — took most of two of those three "specifying" weeks, and went through three rounds of internal review with engineers who actually maintain ECS systems for a living. The same is true of the per-sensor parameters: the manufacturer-published MTBF figures are notoriously optimistic, and the airframer's own data is always more honest. Getting at it took a non-trivial amount of relationship work that no algorithm runs through.

06 What it'd cost you

If you are a programme office considering a sensor-network redesign on a similar subsystem — ECS, fuel, hydraulics, anything in roughly the same regime of complexity — the honest accounting is something like this.

If you have a digital twin already, eight to twelve weeks gets you to a defensible Pareto front with a reasoned recommendation. If you don't have a digital twin, the digital twin is the project; budget for it as you would budget for any other large modelling effort, which is six to eighteen months depending on how much component-level data you can negotiate access to. The optimisation-and-NDCI work, in either case, is the smaller half of the time.

The expertise that's hard to acquire externally is the multi-objective framing itself. Most engineering teams I've worked with arrive at the conversation already wanting a single number — "the best sensor suite," "the right number of sensors." The first half of the engagement is, frankly, getting the team to argue about the trade-off in front of them rather than wishing it away. The MOSOF + NDCI workflow is built to make that argument productive: it produces a curve, not a number, and asks the team to pick from the curve with eyes open.

If you'd like to scope a similar piece of work, the consulting page is the right starting point. The case-study version of this engagement — a single subsystem, an existing digital twin, three objectives — is roughly a six-figure piece of work and produces a Pareto front, a recommended knee, and a peer-reviewable paper if you want one. The full-platform version — multiple subsystems, no digital twin yet, more objectives — is materially bigger; that conversation starts with whether the digital twin is the right investment in the first place.

The B737-800 result is the worked example. It is a real subsystem, optimised under a real fault library, with a Pareto front that came out of a real run of a real algorithm. The work the result took is the work that produces a defensible answer. Anyone selling you this kind of optimisation as a turnkey product is selling you something else.

Endnotes

1. "Around eighteen" because the exact count depends on how you score redundant pairs and how you treat the BITE channels embedded in the FCU and cabin pressure controller. The published thesis count is 18 ± 1.
2. A Pareto-optimal configuration is one where you can't improve any objective without making another one worse. The set of all such configurations is the Pareto front. See Deb, K. (2001), Multi-Objective Optimization Using Evolutionary Algorithms, Wiley, Ch. 1.
3. Continuous-degradation modelling matters because almost every interesting prognostic question is about the slope of the degradation curve, not the binary outcome at the end of it. The SESAC implementation models each fault as a parameter that drifts from nominal at a configurable rate; the optimisation is run across a slice through that parameter space.
4. The cross-subsystem split matters because some "ECS faults" (water-separator blockage in particular) are observable through bleed-air pressure on the engine side rather than through any sensor on the ECS itself. The algorithm finds those cross-subsystem couplings; an additive design process rarely does.
5. The full data file, including the provenance note on which numbers are verbatim from the thesis and which were synthesised for the interactive figure, is at /data/pareto-b737-ecs.json. The peer-reviewed paper is Suslu, Ali, Jennions (2026), Sensors 26(1), 160, doi:10.3390/s26010160.
