 

pd.concat throws a ValueError: all the input array dimensions except for the concatenation axis must match

I get an error when concatenating two dataframes by row. In the previous version I used, pd.concat([df1, df2], axis=0) worked, but in pandas 2.1.0 it fails. Does anybody know how to solve this error?

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[47], line 2
      1 print(real_last.shape, real_exp.shape) #(59202, 34) (4583, 34)
----> 2 real_out = pd.concat([real_exp, real_last], axis=0)
      3 print(real_out.shape)

File c:\Users\sarud\anaconda3\envs\ETLupdate\Lib\site-packages\pandas\core\reshape\concat.py:393, in concat(objs, axis, join, ignore_index, keys, levels, names, verify_integrity, sort, copy)
    378     copy = False
    380 op = _Concatenator(
    381     objs,
    382     axis=axis,
   (...)
    390     sort=sort,
    391 )
--> 393 return op.get_result()

File c:\Users\sarud\anaconda3\envs\ETLupdate\Lib\site-packages\pandas\core\reshape\concat.py:680, in _Concatenator.get_result(self)
    676             indexers[ax] = obj_labels.get_indexer(new_labels)
    678     mgrs_indexers.append((obj._mgr, indexers))
...
--> 230 return super()._concat_same_type(to_concat, axis=axis)

File arrays.pyx:190, in pandas._libs.arrays.NDArrayBacked._concat_same_type()

ValueError: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 4583 and the array at index 1 has size 59202

These are my package versions: print(sys.version, pd.__version__, np.__version__, sep='\n')

3.11.5 | packaged by Anaconda, Inc. | (main, Sep 11 2023, 13:26:23) [MSC v.1916 64 bit (AMD64)]
2.1.0
1.26.0

The dataframes have the same structure; here is a sample of each:

print(real_last.sample(2).T.to_markdown())
43597 9338
Orden 006710000 006781111
Operacion 0010 0020
Operacion.text XXXXXX YYYYYYY
Cl.orden NP NP
Cl.actividad 030 035
Ubic.tecnica XXXX-XX-LAS-DES-BAP19 XXXX-XX-S13-MBA
Status.sistema CTEC NOTI IMOP KKMP PREC LIB. IMOP KKMP PREC
Status.sistema.op NOTI CONT CTEC NLIQ LIB. NLIQ
Stat.Usuario TRAT TRAT
Fe.Entrada 2023-06-25 00:00:00 2023-07-23 00:00:00
Fe.Lib 2023-07-06 00:00:00 2023-07-23 00:00:00
Fe.Ini.real.ot 2023-07-01 00:00:00 NaT
Fe.Ini.real.op 2023-07-01 00:00:00 NaT
Fe.Ini.temp 2023-07-06 00:00:00 2023-07-23 00:00:00
Aviso 00120100 11194911
Modif.por XXXXX005 XXXXX011
Fe.Modif 2023-07-06 00:00:00 2023-07-23 00:00:00
Autor XXXXX003 XXXXX021
Grupo.planif XXT XX1
G.hojas.ruta nan nan
CGH nan nan
Plan.mant.prev nan nan
Pos.PM nan nan
Pto.tbjo.resp XXXXXXXX XXXXXXXX
Pto.tbjo.op XXXXXXXX XXXXXXXX
Cantidad 1 0
Duracion.normal 1.0 0.0
Trabajo 1.0 0.0
Trabajo.real 1.0 0.0
Costos tot.reales 147.03 0.0
Sum.costo.plan 147.03 479.96
Tot.plan.general 147.03 479.96
Total.real.general 147.03 0.0
Costo.dist 0.0 0.0

print(real_exp.sample(2).T.to_markdown())

926 990
Orden 222212222 333323333
Operacion 0120 0040
Operacion.text XXXXXXXXXX YYYYYYYYY
Cl.orden PL PL
Cl.actividad 010 010
Ubic.tecnica XXXX-XX-S07-ALI-CTR7B XXXX-XX-SCA-AL2-AOG1C
Status.sistema CTEC NOTI IMPR FMAT IMOP MOVM NLIQ PREC* LIB. NOTI IMPR DOCU IMOP KKMP NLIQ PREC*
Status.sistema.op NOTI CTEC IMPR NLIQ NOTI CONT IMPR LIB. NLIQ PLAN
Stat.Usuario TBTR TRAT
Fe.Entrada 2023-08-02 00:00:00 2023-08-02 00:00:00
Fe.Lib 2023-08-23 00:00:00 2023-08-21 00:00:00
Fe.Ini.real.ot 2023-09-04 00:00:00 2023-09-05 00:00:00
Fe.Ini.real.op 2023-09-05 00:00:00 2023-09-06 00:00:00
Fe.Ini.temp 2023-09-07 00:00:00 2023-09-04 00:00:00
Aviso 33333333 44444444
Modif.por XXXXX009 XXXXX003
Fe.Modif 2023-09-10 00:00:00 2023-09-07 00:00:00
Autor XXXXXXXXXXXX XXXXXXXXXXXX
Grupo.planif XX0 XXC
G.hojas.ruta 1886 76326
CGH 3 3
Plan.mant.prev 8763 191111
Pos.PM 95475 357140
Pto.tbjo.resp XXXXXXXX XXXXXXXX
Pto.tbjo.op XXXXXXXX XXXXXXXX
Cantidad 4 2
Duracion.normal 4.0 1.0
Trabajo 16.0 2.0
Trabajo.real 16.0 0.5
Costos tot.reales 1627.5 0.04
Sum.costo.plan 2336.45 0.09
Tot.plan.general 2336.45 0.09
Total.real.general 1627.5 0.04
Costo.dist nan nan
Ruben Miranda asked Sep 05 '25 08:09



1 Answer

I can't reproduce the ValueError with the given examples, but since your dataframes hold datetime values, it may be caused by a dtype and/or resolution mismatch, as in this Q/A. You can also check GH55067, which discusses a similar issue.

Try this:

real_out = pd.concat([real_exp, real_last.astype(real_exp.dtypes)], axis=0)

Output:

print(real_out)

           Orden Operacion  ... Total.real.general Costo.dist
926    222212222      0120  ...             1627.5        NaN
990    333323333      0040  ...               0.04        NaN
43597  006710000      0010  ...             147.03        0.0
9338   006781111      0020  ...                0.0        0.0

[4 rows x 34 columns]
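
To see which columns are affected before casting, a small check along these lines may help. It is only a sketch: dtype_mismatches is a hypothetical helper (not part of pandas), and the two tiny frames are made-up stand-ins for real_exp and real_last, with one column deliberately stored as strings in one frame and as datetimes in the other.

```python
import pandas as pd


def dtype_mismatches(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Return the shared columns whose dtypes differ between the two frames.

    Hypothetical helper for illustration; it is not a pandas function.
    """
    shared = left.columns.intersection(right.columns)
    report = pd.DataFrame({
        # The string form of a datetime dtype includes its unit,
        # so a resolution difference shows up here as well.
        "left": left.dtypes[shared].astype(str),
        "right": right.dtypes[shared].astype(str),
    })
    return report[report["left"] != report["right"]]


# Made-up stand-ins for the real dataframes (assumed data, three columns only).
real_exp = pd.DataFrame({
    "Orden": ["222212222"],
    "Fe.Lib": pd.to_datetime(["2023-08-23"]),
    "Costo.dist": [float("nan")],
})
real_last = pd.DataFrame({
    "Orden": ["006710000"],
    "Fe.Lib": ["2023-07-06"],  # strings instead of datetimes: deliberate mismatch
    "Costo.dist": [0.0],
})

# List the columns that disagree.
mismatches = dtype_mismatches(real_exp, real_last)
print(mismatches)

# Cast one frame to the other's dtypes, then concatenate by row.
aligned = real_last.astype(real_exp.dtypes.to_dict())
real_out = pd.concat([real_exp, aligned], axis=0)
print(real_out.dtypes)
```

If the report comes back empty on your real data, the dtypes already agree and the cause is probably something else.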
Timeless answered Sep 09 '25 20:09
