In this system we can’t, by design.

So even if we collect data on all 3 flows, it doesn’t matter: the information is redundant.

Information about one flow would be enough.

The rest can be deduced by proportionality/correlation.

Only when we use an independent valve on each pipe can we say that the number of dimensions is equal to the number of features.

But then, we need to try all the different combinations between all the states of all the valves on all the pipes, before we can say that we “covered” the whole exploration space.

What matters is how many independent knobs we have.

If we consider only 2 states per valve, totally open and totally closed, then with each additional valve added to the system we have 2 times more combinations to explore.

For instance, with 3 valves we get the following 8 combinations:

[closed, closed, closed]
[closed, closed, opened]
[closed, opened, closed]
[closed, opened, opened]
[opened, closed, closed]
[opened, closed, opened]
[opened, opened, closed]
[opened, opened, opened]

Each additional pipe, with an additional valve (or knob), will add an additional dimension to the space and double the data needed.
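The enumeration above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `valve_combinations` and the `STATES` tuple are made up for illustration, and it assumes every valve has exactly two states.

```python
from itertools import product

# Assumption: each valve has only two states, as in the text.
STATES = ("closed", "opened")


def valve_combinations(n_valves):
    """Return every closed/opened assignment for n_valves independent valves."""
    return list(product(STATES, repeat=n_valves))


# List the combinations for 3 valves, in the same order as above.
for combo in valve_combinations(3):
    print(list(combo))

# The number of combinations doubles with each extra valve: 2 ** n.
for n in (1, 2, 3, 10, 20):
    print(n, "valves ->", len(STATES) ** n, "combinations")
```

With 20 independent valves there are already more than a million combinations to visit, even with only two states per valve.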

We would need exponentially more data to cover all the possible combinations.

That’s a well-known problem in data analytics called “The curse of dimensionality”.

Too many variables!

I was going to write about the mathematical tools that make it possible to discover the number of independent dimensions from the different features (like Principal Component Analysis, PCA).

But I found this excellent blog post about the topic.
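As a quick illustration of the idea, here is a minimal sketch of PCA done with a plain singular value decomposition in NumPy. Everything in it is an assumption made for the example: the simulated flows, the split ratios, the noise level, the 99% variance threshold, and the helper name `independent_dimensions`. It compares 3 flows driven by a single valve with 3 flows driven by 3 independent valves.

```python
import numpy as np

rng = np.random.default_rng(0)


def independent_dimensions(data, threshold=0.99):
    """Count the principal components needed to explain `threshold` of the variance."""
    centered = data - data.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    # Index of the first component at which the cumulative share reaches the threshold.
    return int(np.searchsorted(np.cumsum(explained), threshold) + 1)


n_samples = 1000

# Case 1: one valve feeds 3 pipes with fixed (hypothetical) split ratios.
opening = rng.uniform(0.0, 1.0, size=(n_samples, 1))
ratios = np.array([[0.5, 0.3, 0.2]])
one_valve = opening * ratios + rng.normal(0.0, 0.001, size=(n_samples, 3))

# Case 2: each of the 3 pipes has its own independent valve.
three_valves = rng.uniform(0.0, 1.0, size=(n_samples, 3))

print("one valve, 3 flows:", independent_dimensions(one_valve))
print("three valves, 3 flows:", independent_dimensions(three_valves))
```

Both datasets have 3 features, but in the first case a single component carries almost all the variance, while in the second all 3 are needed.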

Originally published at DataThings on March 6, 2019.
