I have a dataframe as follows:
Index  A  B  C  D  E  F
1      0  0  C  0  E  0
2      A  0  0  0  0  F
3      0  0  0  0  E  0
4      0  0  C  D  0  0
5      A  B  0  0  0  0
Basically, I would like to write the dataframe to a txt file such that every row consists of the index followed by the names of the columns whose value is not zero.
For example:
txt file
1 C E
2 A F
3 E
4 C D
5 A B
The dataset is quite big, about 1k rows, 16k columns. Is there any way I can do this using a function in Pandas?
How do I convert a DataFrame to a text file? Use np.savetxt() to write the contents of a DataFrame into a text file.
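A minimal sketch of the np.savetxt() route, assuming a small toy frame with string values and a temporary output path (both are made up for illustration). Note that np.savetxt() writes only the values, so the index and the column names are not part of the output:

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical toy frame; the values are strings, as in the question.
df = pd.DataFrame(
    {"A": ["0", "A"], "B": ["0", "0"], "C": ["C", "0"]},
    index=pd.Index([1, 2], name="Index"),
)

# Hypothetical output location inside a fresh temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "frame.txt")

# fmt="%s" is needed because the cells are strings rather than floats.
# Only the values are written; index and column names are dropped.
np.savetxt(out_path, df.to_numpy(), fmt="%s", delimiter=" ")

with open(out_path) as fh:
    saved_lines = fh.read().splitlines()
```

This is fine for dumping raw values, but it does not by itself produce the "index plus column names" layout asked for in the question.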
If you want to change the data type of all columns in the DataFrame to the string type, you can use df.astype(str) or df.applymap(str). Note that applymap is deprecated since pandas 2.1 in favor of DataFrame.map.
We can read data from a text file using read_table() in pandas. This function reads a general delimited file into a DataFrame object. It is essentially the same as read_csv(), except that the delimiter defaults to '\t' instead of a comma.
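To illustrate the point about the default delimiter, here is a small sketch that reads the same tab-separated text with both functions. The sample content is hypothetical and is held in memory with io.StringIO instead of a real file:

```python
import io

import pandas as pd

# Hypothetical tab-separated content standing in for a file on disk.
raw = "Index\tA\tB\n1\t0\tB\n2\tA\t0\n"

# read_table splits on tabs by default.
from_table = pd.read_table(io.StringIO(raw), index_col="Index")

# read_csv gives the same result once the separator is passed explicitly.
from_csv = pd.read_csv(io.StringIO(raw), sep="\t", index_col="Index")
```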
Saving a text file in Python: file objects provide two methods for this. write(str1) writes the string str1 to the file as-is. writelines(lines) writes each string of a list to the file in turn. Neither method appends a newline, so add "\n" yourself where you want a line break.
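A short sketch of both methods, assuming a temporary output path and a few made-up lines in the layout the question asks for:

```python
import os
import tempfile

# Hypothetical output location inside a fresh temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "rows.txt")

# Hypothetical rows, already formatted as "index names".
rows = ["1 C E", "2 A F", "3 E"]

with open(out_path, "w") as fh:
    # write() takes a single string; the newline must be added explicitly.
    fh.write(rows[0] + "\n")
    # writelines() takes an iterable of strings and adds no newlines either.
    fh.writelines(row + "\n" for row in rows[1:])

with open(out_path) as fh:
    written = fh.read().splitlines()
```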
Take a matrix-vector product between the boolean matrix generated by "is this entry "0" or not" and the columns of the dataframe, and write it to a text file with to_csv (thanks to @Andreas' answer!):
df.ne("0").dot(df.columns + " ").str.rstrip().to_csv("text_file.txt")
where we right-strip each result to remove the trailing space left by the " " appended to every column name.
If you don't want the name Index appearing in the text file, you can chain a rename_axis(index=None) to get rid of it, i.e.,
df.ne("0").dot(df.columns + " ").str.rstrip().rename_axis(index=None)
and then call to_csv as above.
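As a worked sketch of the approach, the snippet below rebuilds the frame from the question (assuming every cell is a string, zeros included) and applies the same dot product. The final step differs from the answer: instead of to_csv it writes the lines by hand, which is one way to get the space-separated layout shown in the question. The output path is a made-up temporary file.

```python
import os
import tempfile

import pandas as pd

# The frame from the question; every cell is assumed to be a string.
df = pd.DataFrame(
    [
        ["0", "0", "C", "0", "E", "0"],
        ["A", "0", "0", "0", "0", "F"],
        ["0", "0", "0", "0", "E", "0"],
        ["0", "0", "C", "D", "0", "0"],
        ["A", "B", "0", "0", "0", "0"],
    ],
    index=pd.Index([1, 2, 3, 4, 5], name="Index"),
    columns=list("ABCDEF"),
)

# df.ne("0") is True where a cell is not "0". Since True * "C " is "C "
# and False * "C " is "", the dot product concatenates, per row, the
# names of the columns that hold a nonzero entry.
names = df.ne("0").dot(df.columns + " ").str.rstrip()

# Hypothetical output location inside a fresh temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "text_file.txt")

# Write "index names" lines by hand to match the requested layout.
with open(out_path, "w") as fh:
    for idx, row_names in names.items():
        fh.write(f"{idx} {row_names}\n")

with open(out_path) as fh:
    lines = fh.read().splitlines()
```

With to_csv, by contrast, the index and the names are separated by a comma and a header row is written by default; passing header=False should suppress the header if it is not wanted.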