
tf sumpooling layer 1d vs 2d

I am currently trying to replicate the results of a paper by Sturm et al. (2016), published in the Journal of Neuroscience, using Python with the TensorFlow and Keras libraries.

I have strong doubts about whether I have understood how they designed the model, as explained in section 2.1.

I couldn't fully understand the following points because of my lack of experience in the field.

  1. Did they use 1d or 2d sumpooling layers?
  2. What were the exact output shapes after each layer?
  3. Did they use a categorical format for output?
  4. Did they use dropout and any other layers?

How would you go about designing the described model?

Thank you in advance for your valuable comments.

Leon Rai asked May 26 '26 15:05


1 Answer

Here's my take on this.

Dataset used: Here and Dataset from this paper

Did they use 1d or 2d sumpooling layers?

Although they do not say explicitly what they used in the model, it should be 1D pooling, because an EEG signal has only two dimensions (time, channels), which is the input shape that 1D pooling layers accept (not counting the batch dimension). Furthermore, the paper says (section 2.1):

The first linear layer accepts an input of the dimensionality
301 time points × 118 channels EEG features

But then they say the following, which is a bit odd.

Each epoch’s spatio-temporal features (301 time points × 118 channels for aa-ay,
301 time point × 58 channels for subject od-obx) were vectorized
into one vector with 33518 (17458) dimensions.

I'm guessing they added a trailing dimension of size 1 after this vectorization, so that a single input has 2 dimensions; otherwise, pooling cannot be performed. Also, the vector length should be 35518 (301 × 118), not 33518.
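To make the shapes concrete, here is a minimal NumPy sketch of the vectorization, the extra axis, and a non-overlapping 1D sum pooling over the result. The helper `sum_pool_1d`, the dummy batch of 4 epochs, and `pool_size=2` are my own illustrative choices, not something taken from the paper.

```python
import numpy as np

def sum_pool_1d(x, pool_size):
    """Non-overlapping 1D sum pooling over axis 1 of a (batch, steps, features) array."""
    batch, steps, features = x.shape
    n_out = steps // pool_size  # trailing steps that do not fill a window are dropped
    trimmed = x[:, :n_out * pool_size, :]
    return trimmed.reshape(batch, n_out, pool_size, features).sum(axis=2)

n_time, n_channels = 301, 118
epochs = np.ones((4, n_time, n_channels), dtype=np.float32)  # dummy batch of 4 epochs

vectorized = epochs.reshape(len(epochs), -1)   # (4, 35518): one vector per epoch
with_axis = vectorized[..., np.newaxis]        # (4, 35518, 1): trailing axis for 1D pooling
pooled = sum_pool_1d(with_axis, pool_size=2)   # pool_size is arbitrary, for illustration only

print(vectorized.shape, with_axis.shape, pooled.shape)
```

As far as I know Keras has no built-in sum pooling layer; one common workaround is to multiply the output of `AveragePooling1D` by its `pool_size`, which gives the same values for non-overlapping windows with `'valid'` padding.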

What were the exact output shapes after each layer?

There seem to be two DNNs (one for the first dataset and another for the second). I'm not entirely sure about this, but it would be necessary because the two datasets have different input sizes.

(None, 301, 118) (None, 301, 58)
       |                |
       V                V
   Flatten()         Flatten()
       |                |
       V                V
(None, 35518, 1) (None, 17458, 1)
       |                |
       V                V
  Sumpooling()      Sumpooling()
       |                |
       V                V
  (None, 500)       (None, 500)
       |                |
       V                V
      Tanh             Tanh
       |                |
       V                V
  (None, 500)      (None, 500)
       |                |
       V                V
   Softmax(2)       Softmax(2)    
       |                |
       V                V
   (None, 2)        (None, 2)    
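The shape flow of the left-hand column can be checked with a small NumPy forward pass. Note one assumption that is mine and not confirmed by the paper: plain non-overlapping pooling cannot turn 35518 values into exactly 500 (500 does not divide 35518), so this sketch stands in a weighted sum (a dense layer) for that step. The weights are random and untrained, so only the shapes are meaningful.

```python
import numpy as np

def forward(x, w1, b1, w2, b2):
    """Shape walk-through: flatten -> weighted sum to 500 -> tanh -> softmax over 2 classes."""
    flat = x.reshape(x.shape[0], -1)                      # (batch, time * channels)
    hidden = np.tanh(flat @ w1 + b1)                      # (batch, 500)
    logits = hidden @ w2 + b2                             # (batch, 2)
    logits = logits - logits.max(axis=1, keepdims=True)   # for numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)           # each row sums to 1

rng = np.random.default_rng(0)
batch, n_time, n_channels, n_hidden, n_classes = 3, 301, 118, 500, 2

x = rng.standard_normal((batch, n_time, n_channels), dtype=np.float32)
# Assumption: a dense weight matrix stands in for the 35518 -> 500 step.
w1 = 0.01 * rng.standard_normal((n_time * n_channels, n_hidden), dtype=np.float32)
b1 = np.zeros(n_hidden, dtype=np.float32)
w2 = 0.01 * rng.standard_normal((n_hidden, n_classes), dtype=np.float32)
b2 = np.zeros(n_classes, dtype=np.float32)

probs = forward(x, w1, b1, w2, b2)
print(probs.shape)
```

For the second dataset the same function works with `n_channels = 58`, which only changes the first dimension of `w1` to 17458.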

Did they use a categorical format for output?

They are solving a binary classification problem, so yes, they will be using a categorical (one-hot) format. It would be something like:

If label 0 => [1 0]
If label 1 => [0 1]

They could also have used a final layer with a single sigmoid unit, Sigmoid(1), and kept the labels as they are (i.e. a scalar, 0 or 1).
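For reference, the mapping from scalar labels to the categorical format can be sketched in NumPy as below; the label values are made up for illustration.

```python
import numpy as np

labels = np.array([0, 1, 1, 0])                # made-up scalar class labels
one_hot = np.eye(2, dtype=np.float32)[labels]  # row i of the identity matrix encodes class i

print(one_hot)
```

In Keras, `to_categorical` from `keras.utils` performs the same conversion. The one-hot form pairs with a `Softmax(2)` output and a categorical cross-entropy loss, while the scalar form pairs with a single sigmoid output and a binary cross-entropy loss.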

Did they use dropout and any other layers?

The paper doesn't really mention using dropout. And to be honest, there doesn't seem to be a proper place to use dropout either. The only place would be after the tanh. But since the network is not that complex, the authors probably didn't feel the need.

thushv89 answered Jun 01 '26 12:06


