
How to convert a tab delimited text file to a csv file in Python

Tags:

python-3.x

csv

I have the following problem:

I want to convert a tab-delimited text file to a CSV file. The text file is the SentiWS dictionary, which I want to use for sentiment analysis (https://github.com/MechLabEngineering/Tatort-Analyzer-ME/tree/master/SentiWS_v1.8c).

The code I used to do this is the following:

import csv

txt_file = r"SentiWS_v1.8c_Positive.txt"
csv_file = r"NewProcessedDoc.csv"

in_txt = csv.reader(open(txt_file, "r"), delimiter = '\t')
out_csv = csv.writer(open(csv_file, 'w'))

out_csv.writerows(in_txt)

This code writes everything into one column, but I need the data in three columns, as in the original file. There is also a blank line under each data row and I don't know why.

I want the data to be in this form:

Column1 Column2 Column3

Word Data Words

Word Data Words

instead of

Column1

Word,Data,Words

Word,Data,Words

Can anyone help me?

gHOsTaManTe asked Mar 14 '17 02:03



2 Answers

Try this:

import csv

txt_file = r"SentiWS_v1.8c_Positive.txt"
csv_file = r"NewProcessedDoc.csv"

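# newline='' is an argument of open(), not csv.writer(); it lets the csv module control line endings.
with open(txt_file, "r", newline='') as in_text:
    in_reader = csv.reader(in_text, delimiter='\t')
    with open(csv_file, "w", newline='') as out_csv:
        out_writer = csv.writer(out_csv)
        # Copy each tab-delimited row to the comma-delimited output.
        for row in in_reader:
            out_writer.writerow(row)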

There is also a blank line under each data row and I don't know why.

This usually happens when the script runs on Windows and the output file is opened without newline=''. According to the Python 3 csv module docs:

If newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n line endings on write an extra \r will be added. It should always be safe to specify newline='', since the csv module does its own (universal) newline handling.
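To see the effect described in that quote on any platform, here is a minimal sketch. It assumes that opening the file with newline="\r\n" is a fair stand-in for Windows text mode, since both translate each written "\n" into "\r\n". The helper name write_rows and the sample rows are made up for illustration.

```python
import csv
import os
import tempfile


def write_rows(path, rows, newline):
    """Write rows with csv.writer, then return the raw bytes that landed on disk."""
    with open(path, "w", newline=newline) as handle:
        csv.writer(handle).writerows(rows)
    with open(path, "rb") as handle:
        return handle.read()


# Made-up sample rows, matching the placeholders used in the question.
rows = [["Word", "Data", "Words"], ["Word", "Data", "Words"]]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.csv")
    # Simulates Windows text mode: every "\n" written becomes "\r\n".
    windows_like = write_rows(path, rows, newline="\r\n")
    # Recommended: no translation, the csv module's own "\r\n" is kept as-is.
    recommended = write_rows(path, rows, newline="")

print(windows_like)  # each row ends in \r\r\n, which many programs show as a blank line
print(recommended)   # each row ends in a single \r\n
```

The csv writer ends every row with "\r\n" by default, so any extra translation on top of that produces the doubled "\r" that shows up as an empty line.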

Dan answered Oct 03 '22 00:10


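Alternatively, use pandas:

import pandas

Read the tab-delimited text file into a DataFrame (header=None assumes the file has no header row, so the first line is kept as data):

dataframe = pandas.read_csv("SentiWS_v1.8c_Positive.txt", delimiter="\t", header=None)

Write the DataFrame to a CSV file without the index or the generated column names:

dataframe.to_csv("NewProcessedDoc.csv", encoding='utf-8', index=False, header=False)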
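Whichever approach is used, it can help to read the result back and count the fields per row before opening it in a spreadsheet program. The sketch below does that with only the standard library. The function name tsv_to_csv, the UTF-8 encoding, and the sample values are assumptions for illustration and are not taken from the SentiWS file.

```python
import csv
import os
import tempfile


def tsv_to_csv(src_path, dst_path, encoding="utf-8"):
    """Copy tab-delimited rows from src_path into a comma-delimited dst_path."""
    with open(src_path, "r", newline="", encoding=encoding) as src, \
         open(dst_path, "w", newline="", encoding=encoding) as dst:
        csv.writer(dst).writerows(csv.reader(src, delimiter="\t"))


# Made-up sample in a word / score / inflections layout (not real dictionary data).
sample = "gut|ADJX\t0.3716\tbesser,beste\nFreude|NN\t0.6502\tFreuden\n"

with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "sample.txt")
    dst_path = os.path.join(tmp, "sample.csv")
    with open(src_path, "w", newline="", encoding="utf-8") as handle:
        handle.write(sample)
    tsv_to_csv(src_path, dst_path)
    # Read the CSV back to confirm every row was split into three fields.
    with open(dst_path, "r", newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))

field_counts = [len(row) for row in rows]
print(rows)
print(field_counts)
```

If every row reports three fields here but a spreadsheet program still shows a single column, the file itself is fine and the program is probably expecting a different separator. That is only a guess, since the question does not say how the output was viewed.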
Sunku Vamsi Tharun Kumar answered Oct 03 '22 01:10