I've read the [PCA documentation](http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html) of scikit-learn, which says:

> [...] improve the predictive accuracy of the downstream estimators [...]
What is the definition of “downstream” in machine learning?
In IT, "downstream" refers to the transmission of data to an end user or toward an end user from a central server or point of origin. This is in contrast to upstream transmissions, which move from the end user to the central repository.
down·stream (ˈdau̇n-ˈstrēm): (1) in the direction of or nearer to the mouth of a stream ("floating downstream", "located two miles downstream"); (2) in or toward the latter stages of a usually industrial process, or the stages (such as marketing) after manufacture.
In the context of self-supervised learning and transfer learning (both widely used in NLP), a downstream task is the task that you actually want to solve, as opposed to the task used for pre-training.
I know the term "downstream" from neural networks. In those machine learning models, you have so-called "neurons", which are usually arranged in a directed acyclic graph (DAG). Downstream is everything after a certain neuron: neuron y is downstream of neuron x if and only if there is a directed path from x to y.
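As a rough illustration of that graph definition, here is a minimal sketch that checks whether one node is downstream of another. The adjacency-dict representation, the function name `is_downstream`, and the node names are all made up for this example; it also assumes a path must have at least one edge, so a node is not downstream of itself.

```python
from collections import deque


def is_downstream(graph, x, y):
    """Return True if a directed path with at least one edge leads from x to y.

    `graph` is a hypothetical adjacency dict: node -> list of successor nodes.
    """
    seen = set()
    queue = deque(graph.get(x, ()))  # start from the direct successors of x
    while queue:
        node = queue.popleft()
        if node == y:
            return True
        if node in seen:
            continue
        seen.add(node)
        queue.extend(graph.get(node, ()))
    return False


# Toy network: x feeds two hidden neurons, which both feed y.
graph = {"x": ["h1", "h2"], "h1": ["y"], "h2": ["y"], "y": []}

print(is_downstream(graph, "x", "y"))  # True
print(is_downstream(graph, "y", "x"))  # False
```

The relation is one-directional: y is downstream of x here, but x is not downstream of y.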
In a more general setting, I can only guess: y is downstream of x if and only if y uses data processed by x.
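Under that reading, the "downstream estimators" in the PCA documentation would be whatever model consumes the PCA output. A minimal sketch, assuming scikit-learn is installed; the iris dataset, `n_components=2`, and `LogisticRegression` as the downstream estimator are arbitrary choices for illustration, not something the documentation prescribes:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

X, y = load_iris(return_X_y=True)

# PCA is the upstream step; the classifier only ever sees PCA's output,
# so it is "downstream" of PCA in the sense guessed at above.
pipe = make_pipeline(
    PCA(n_components=2, whiten=True),
    LogisticRegression(max_iter=1000),
)
pipe.fit(X, y)

print([name for name, _ in pipe.steps])  # ['pca', 'logisticregression']
```

The step order in the pipeline mirrors the data flow: each step is downstream of every step listed before it.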