Converting .arff file to .csv using Python

Tags: python, csv, arff

I have a file, "LMD.rh.arff", that I am trying to convert to a .csv file using the following code:

import pandas as pd
import matplotlib.pyplot as plt
from scipy.io import arff


# Read in .arff file-
data = arff.loadarff("LMD.rh.arff")

But the last line of code raises this error:

---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)
----> 1 data = arff.loadarff("LMD.rh.arff")

~/.local/lib/python3.6/site-packages/scipy/io/arff/arffread.py in loadarff(f)
    539         ofile = open(f, 'rt')
    540     try:
--> 541         return _loadarff(ofile)
    542     finally:
    543         if ofile is not f:  # only close what we opened

~/.local/lib/python3.6/site-packages/scipy/io/arff/arffread.py in _loadarff(ofile)
    627         a = generator(ofile)
    628         # No error should happen here: it is a bug otherwise
--> 629         data = np.fromiter(a, descr)
    630         return data, meta
    631

UnicodeEncodeError: 'ascii' codec can't encode character '\xf3' in position 4: ordinal not in range(128)

You can download the file arff_file

Any ideas as to what's going wrong?

Thanks!

Arun asked Jun 21 '26 01:06


1 Answer

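Try this. It does not use scipy: it converts every .arff file in a directory by hand, building the CSV header from the @attribute lines and copying everything after @data as the rows.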

import os

path_to_directory = "./"
files = [name for name in os.listdir(path_to_directory) if name.endswith(".arff")]

def toCsv(content):
    """Turn the lines of an ARFF file into the lines of a CSV file."""
    data = False  # becomes True once the @data marker has been seen
    columns = []
    newContent = []
    for line in content:
        if not data:
            # ARFF keywords are case-insensitive
            keyword = line.strip().lower()
            if keyword.startswith("@attribute"):
                # "@attribute <name> <type>": the second token is the column name
                columns.append(line.split()[1])
            elif keyword.startswith("@data"):
                data = True
                newContent.append(",".join(columns) + "\n")
        elif line.strip() and not line.startswith("%"):
            # Copy data rows unchanged; skip blank lines and % comments
            newContent.append(line)
    return newContent

# Main loop for reading and writing files
for file in files:
    path = os.path.join(path_to_directory, file)
    with open(path, "r") as inFile:
        content = inFile.readlines()
    name, ext = os.path.splitext(path)
    with open(name + ".csv", "w") as outFile:
        outFile.writelines(toCsv(content))
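
To check the idea without touching any files, here is a minimal, self-contained sketch that applies the same header-from-@attribute approach to an in-memory ARFF sample. The sample data, the column names, and the helper name arff_lines_to_csv are all made up for illustration. It assumes attribute names contain no spaces and that values contain no quoted commas, so it will not handle every valid ARFF file.

```python
import csv
import io

# Hypothetical sample data; "\u00f3" is the character from the error message.
SAMPLE_ARFF = """\
% comment line
@relation demo
@attribute name string
@attribute score numeric
@data
Ram\u00f3n,1.5
Ana,2.0
"""

def arff_lines_to_csv(lines):
    """Return CSV text: a header built from @attribute lines, then the data rows."""
    columns = []
    rows = []
    in_data = False
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("%"):
            continue  # skip blank lines and ARFF comments
        if in_data:
            rows.append(stripped)
        elif stripped.lower().startswith("@attribute"):
            columns.append(stripped.split()[1])  # second token is the column name
        elif stripped.lower().startswith("@data"):
            in_data = True
    return "\n".join([",".join(columns)] + rows) + "\n"

csv_text = arff_lines_to_csv(SAMPLE_ARFF.splitlines())
records = list(csv.DictReader(io.StringIO(csv_text)))
print(csv_text.splitlines()[0], len(records))  # prints: name,score 2
```

The resulting text can be written to a .csv file, or loaded directly with pandas via pd.read_csv(io.StringIO(csv_text)).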

Shubham Mishra answered Jun 22 '26 16:06