 

Is there a function to write certain values of a dataframe to a .txt file in Python?

I have a dataframe as follows:

Index A B C D E F
1     0 0 C 0 E 0 
2     A 0 0 0 0 F
3     0 0 0 0 E 0
4     0 0 C D 0 0 
5     A B 0 0 0 0

Basically, I would like to write the dataframe to a txt file such that every row consists of the index followed by the names of only those columns whose value is not zero.

For example:

txt file

1 C E 
2 A F 
3 E 
4 C D 
5 A B

The dataset is quite big, about 1k rows by 16k columns. Is there any way I can do this using a function in Pandas?

Radhika asked May 31 '21 10:05



1 Answer

Take the matrix-vector product of the boolean matrix that answers "is this entry "0" or not" with the column names of the dataframe (each followed by a space), and write the result to a text file with to_csv (thanks to @Andreas' answer!):

df.ne("0").dot(df.columns + " ").str.rstrip().to_csv("text_file.txt")

where rstrip removes the trailing space that the " " appended to each column name leaves at the end of every row.
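
To see why the product works, here is a minimal sketch on the sample frame from the question, assuming the cells hold strings such as "0" and "C" (the question does not state the dtypes). The names mask and labels are only illustrative. Multiplying True by a string keeps the string and False gives an empty string, so the dot product concatenates the names of the non-zero columns in each row.

```python
import pandas as pd

# Sample frame from the question; values are assumed to be strings.
df = pd.DataFrame(
    {
        "A": ["0", "A", "0", "0", "A"],
        "B": ["0", "0", "0", "0", "B"],
        "C": ["C", "0", "0", "C", "0"],
        "D": ["0", "0", "0", "D", "0"],
        "E": ["E", "0", "E", "0", "0"],
        "F": ["0", "F", "0", "0", "0"],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="Index"),
)

# True where the entry is not the string "0".
mask = df.ne("0")

# Each True contributes "<column name> ", each False contributes "".
labels = mask.dot(df.columns + " ").str.rstrip()

print(labels.tolist())
```

If the zeroes are stored as integers rather than strings, df.ne(0) would be the comparison to use instead.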

If you don't want the name Index appearing in the text file, you can chain rename_axis(index=None) to get rid of it, i.e.,

df.ne("0").dot(df.columns + " ").str.rstrip().rename_axis(index=None)

and then to_csv as above.
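
Note that to_csv separates the index from the value with a comma by default and writes a header line, so the file will not look exactly like the space-separated layout in the question. One way to get that layout is to write the lines by hand. The following is a sketch, not part of the answer above: write_nonzero_labels is a hypothetical helper, and the values are again assumed to be strings.

```python
import pandas as pd

def write_nonzero_labels(df, path, zero="0"):
    """Write one line per row: the index, then the names of the non-zero columns."""
    mask = df.ne(zero).to_numpy()   # boolean array, one row per dataframe row
    cols = df.columns.to_numpy()    # column names as an array, for boolean indexing
    with open(path, "w") as fh:
        for idx, row in zip(df.index, mask):
            # cols[row] selects the names of the columns where the mask is True
            fh.write(" ".join([str(idx), *map(str, cols[row])]) + "\n")

# Sample frame from the question; values are assumed to be strings.
df = pd.DataFrame(
    {
        "A": ["0", "A", "0", "0", "A"],
        "B": ["0", "0", "0", "0", "B"],
        "C": ["C", "0", "0", "C", "0"],
        "D": ["0", "0", "0", "D", "0"],
        "E": ["E", "0", "E", "0", "0"],
        "F": ["0", "F", "0", "0", "0"],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="Index"),
)

write_nonzero_labels(df, "text_file.txt")
```

The loop runs once per row (about 1k times for the dataset described), while the selection across the 16k columns is done by NumPy boolean indexing.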

Mustafa Aydın answered Oct 17 '22 06:10