
'Columns must be same length as key' error when using .str.split

The code below runs fine with Python 3.8.10 but fails with Python 3.10. Any idea what the problem could be?

import pandas as pd
import requests

url = "https://coinmarketcap.com/new/"
page = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'}, timeout=1)
pagedata = page.text
usecols = ["Name", "Symbol", "1h", "24h", "MarketCap"]


df = pd.read_html(page.text)[0]
df[["Name", "Symbol"]] = df["Name"].str.split(r"\d+", expand=True)

df = (df.rename(columns={"Fully Diluted Market Cap": "MarketCap"})[usecols]
          .sort_values("24h", ascending=False, key=lambda ser: ser.str.replace("%", "").astype(float))
          .replace(r"^\$", "", regex=True)
     )

numcols = df.columns[~df.columns.isin(['Name'])]
df = df.head(5).to_markdown(index=True)
print(df)

Current Output:

Traceback (most recent call last):
  df[["Name", "Symbol"]] = df["Name"].str.split(r"\d+", expand=True)
  ....
  ....
  ValueError: Columns must be same length as key

Correct Output: (Output in Python 3.8)

|    | Name        | Symbol    | 1h     | 24h      | MarketCap   |
|---:|:------------|:----------|:-------|:---------|:------------|
|  3 | Shrekt      |4HREK      | 23.82% | 2536.51% | 342,357     |
|  8 | BLAZE       |TOKEN9BLZE | 1.07%  | 106.71%  | 3,828,088   |
| 26 | Goner27     |GONER      | 6.32%  | 88.09%   | 1,094,010   |
| 14 | Party Hat15 |PHAT       | 13.34% | 81.64%   | 60,136      |
| 29 | PepeChat    |30PPC      | 48.01% | 78.25%   | 431,159     |
Drew Duazeh asked Mar 07 '26 21:03



2 Answers

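I think it has to do with one of the values (NOOT (BRC-20)4NOOT) found in the column Name. That value contains two separate numbers (20 and 4), so splitting on every run of digits gives three pieces instead of two, and the expanded result no longer fits into the two columns Name and Symbol.

To handle this, we can split only on the last number found in each row of this column.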

Replace this:

df[["Name", "Symbol"]] = df["Name"].str.split(r"\d+", expand=True)

With this:

df[["Name", "Symbol"]] = df["Name"].str.split(r"\d+(?!.*\d)", expand=True)

Regex [demo]
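To make the difference between the two patterns visible, here is a minimal sketch that uses Python's re module directly instead of pandas, so it only isolates the regex behavior. The sample strings are assumptions modeled on the "name + rank + symbol" values mentioned in this thread, not live data from the site.

```python
import re

# Hypothetical names in the page's "name + rank + symbol" layout.
names = ["Goner27GONER", "Party Hat15PHAT", "NOOT (BRC-20)4NOOT"]

every_number = re.compile(r"\d+")         # pattern from the question
last_number = re.compile(r"\d+(?!.*\d)")  # pattern from this answer

# Show the pieces each pattern produces for every name.
for name in names:
    print(name, every_number.split(name), last_number.split(name))

# Count the pieces: a count other than 2 cannot fill exactly two columns.
pieces_every = [len(every_number.split(n)) for n in names]
pieces_last = [len(last_number.split(n)) for n in names]
print(pieces_every)  # [2, 2, 3]
print(pieces_last)   # [2, 2, 2]
```

The negative lookahead (?!.*\d) only lets a run of digits match when no other digit follows it later in the string, which is why "20" is skipped and "4" is used as the separator. Note that if a symbol itself contained digits, this pattern would cut inside the symbol, so treat it as a sketch rather than a complete parser.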

Output:

print(df)

|    | Name        | Symbol   | 1h     | 24h      | MarketCap   |
|---:|:------------|:---------|:-------|:---------|:------------|
|  5 | Shrekt      | HREK     | 54.61% | 1124.57% | 159,013     |
| 10 | BLAZE TOKEN | BLZE     | 2.40%  | 109.53%  | 3,880,242   |
|  8 | CMC DOGE    | CMCDOGE  | 12.93% | 102.76%  | 169,492     |
| 28 | Goner       | GONER    | 1.37%  | 88.66%   | 1,050,089   |
|  4 | nomeme      | NOMEME   | 53.86% | 86.14%   | 4,603,393   |
Timeless answered Mar 09 '26 09:03



Do you have to keep assigning the result back into df? I think you can store the split result in a new DataFrame first and look at what it contains. Try just doing this:

newDF = df["Name"].str.split(r"\d+", expand=True)
print(newDF)
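Building on that idea, the split result can also be used to find the rows that cause the error. This is only a sketch on hand-made data: the sample names and the variable names parts, extra and offenders are my own assumptions, not taken from the scraped table.

```python
import pandas as pd

# Hypothetical sample standing in for the scraped "Name" column.
df = pd.DataFrame(
    {"Name": ["Goner27GONER", "NOOT (BRC-20)4NOOT", "BLAZE TOKEN9BLZE"]}
)

# Same split as in the question, kept in a separate frame for inspection.
parts = df["Name"].str.split(r"\d+", expand=True)
print(parts.shape[1])  # number of columns the split produced

# Any non-null value beyond the second column means that row split into
# more than two pieces and breaks the two-column assignment.
extra = parts.iloc[:, 2:]
offenders = df.loc[extra.notna().any(axis=1), "Name"].tolist()
print(offenders)
```

If the printed column count is greater than 2, assigning the result to df[["Name", "Symbol"]] raises the "Columns must be same length as key" error, and offenders lists the names responsible.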

Edit: fixed code:

df["Name"] = df["Name"].str.replace("(BRC-20)", "", regex=False)

Add this line to your code before the split; it removes the text (BRC-20) from any name that contains it. So the problem was not the version of your Python but the data in the Name column.
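Here is a minimal sketch of that cleanup followed by the original split, on hand-made data (the two sample names are assumptions). I pass regex=False so that (BRC-20) is treated as literal text and the parentheses need no escaping; the final strip is my own addition to drop the space left behind by the removal.

```python
import pandas as pd

# Hypothetical sample standing in for the scraped "Name" column.
df = pd.DataFrame({"Name": ["Goner27GONER", "NOOT (BRC-20)4NOOT"]})

# Remove the literal "(BRC-20)" tag so only one number is left per name.
cleaned = df["Name"].str.replace("(BRC-20)", "", regex=False)

# The split from the question now yields exactly two columns.
df[["Name", "Symbol"]] = cleaned.str.split(r"\d+", expand=True)

# Drop the trailing space left where the tag used to be.
df["Name"] = df["Name"].str.strip()
print(df)
```

This only handles this one tag. Any other name that contains extra digits would bring the error back, so splitting on the last number is probably the more general fix.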

Taha Kınık answered Mar 09 '26 10:03
