
How to import TFRecord files into a pandas DataFrame?

I have a TFRecord file and would like to import it into a pandas DataFrame or a NumPy array.

The tools I found for reading TFRecords only work inside a TensorFlow session, which does not fit my use case.

Thanks for any help!

asked May 10 '19 10:05 by maxk



1 Answer

In Colab, install the package with the command below (in a regular terminal, run it without the leading !):

!pip install pandas-tfrecords

After installation, you can load the file like this:

import pandas as pd
import pandas_tfrecords as pdtfr

# Read the TFRecord file and keep the result for further processing
df = pdtfr.tfrecords_to_pandas(file_paths=r'/folder/file.tfrecords')
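
If you would rather not add a dependency, here is a rough sketch of reading the raw records with only the standard library. It relies on the TFRecord framing: an 8-byte little-endian length, a 4-byte checksum of the length, the payload, and a 4-byte checksum of the payload. The function name read_tfrecord_payloads is my own, not part of any library. The sketch assumes an uncompressed file and skips checksum verification.

```python
import struct


def read_tfrecord_payloads(path):
    """Yield the raw payload (bytes) of each record in an uncompressed TFRecord file.

    Hypothetical helper for illustration: checksums are read but NOT verified.
    """
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if not header:
                break  # clean end of file
            if len(header) < 8:
                raise ValueError("truncated record header")
            # Record length is an unsigned 64-bit little-endian integer
            (length,) = struct.unpack("<Q", header)
            f.read(4)  # checksum of the length field, skipped
            payload = f.read(length)
            if len(payload) < length:
                raise ValueError("truncated record payload")
            f.read(4)  # checksum of the payload, skipped
            yield payload
```

Each payload is normally a serialized tf.train.Example, so turning it into columns still needs a protobuf parser, for example tf.train.Example.FromString(payload) if TensorFlow is available. That works without a session, and the parsed features can then be collected into a list of dicts and passed to pd.DataFrame.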

Good luck!

answered Sep 16 '22 15:09 by vinicius motta