Is there the equivalent of to_markdown to read data?

Tags: python, pandas

With pandas 1.0.0, the use of .to_markdown() to show the contents of a DataFrame in markdown on this forum is going to proliferate. Is there a convenient way to load the data back into a DataFrame? Maybe an option such as pd.read_clipboard(markdown=True)?

asked Feb 10 '20 16:02 by divingTobi



1 Answer

You can read markdown tables (or any delimited text table) with the pandas read_table function.

Let's create a sample markdown table:

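import pandas as pd

# to_markdown() returns a string and needs the optional tabulate package
print(pd.DataFrame({"a": [0, 1], "b": [2, 3]}).to_markdown())
|    |   a |   b |
|---:|----:|----:|
|  0 |   0 |   2 |
|  1 |   1 |   3 |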

As you can see, this is just a structured text table: the delimiters are pipes, the cells are padded with whitespace, the leading and trailing pipes produce empty columns at the far left and far right, and there is a header underline row that must be dropped.
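To see where the empty outer columns come from, here is a small sketch in plain Python (no pandas involved) that splits one row of the table on the pipe character. The variable names are only for illustration.

```python
# One row of the markdown table shown above
header = "|    |   a |   b |"

# Splitting on the pipe yields an empty field before the first pipe
# and another after the last pipe
fields = header.split("|")
print(fields)  # ['', '    ', '   a ', '   b ', '']

# Dropping the two outer fields and stripping the padding leaves the real cells
cells = [field.strip() for field in fields[1:-1]]
print(cells)  # ['', 'a', 'b']
```

The first remaining cell is empty because the index column has no header.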

import pandas as pd

df = (
    pd
    # Read a markdown file, getting the header from the first row and the index from the second column
    .read_table('df.md', sep="|", header=0, index_col=1, skipinitialspace=True)
    # Drop the left-most and right-most null columns
    .dropna(axis=1, how='all')
    # Drop the header underline row
    .iloc[1:]
)
print(df)

   a  b
0  0  2
1  1  3
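One caveat with the approach above: because the underline row is read as data, the values are likely to come back as strings, and skipinitialspace only removes leading padding, so column names and cells may keep trailing spaces. The following is a minimal sketch of a helper that cleans this up and reads from a string instead of a file. The names read_markdown_table and _to_numeric_if_possible are hypothetical (they are not part of pandas), and the sketch assumes a table shaped like the to_markdown() output shown earlier, with the index in the first column.

```python
import io

import pandas as pd


def _to_numeric_if_possible(column: pd.Series) -> pd.Series:
    """Return the column converted to numbers, or unchanged if that fails."""
    try:
        return pd.to_numeric(column)
    except (ValueError, TypeError):
        return column


def read_markdown_table(text: str) -> pd.DataFrame:
    """Hypothetical helper: parse a pipe-delimited markdown table into a DataFrame."""
    raw = pd.read_csv(
        io.StringIO(text.strip()),
        sep="|",
        header=0,
        skipinitialspace=True,
        dtype=str,  # keep everything as text until the padding is stripped
    )
    # Drop the header underline row and the empty columns created by
    # the leading and trailing pipes
    body = raw.iloc[1:, 1:-1].copy()
    # Remove the remaining padding from column names and cell values
    body.columns = [name.strip() for name in body.columns]
    body = body.apply(lambda col: col.str.strip())
    # Convert columns to numbers where every value parses as a number
    body = body.apply(_to_numeric_if_possible)
    # Assumption: the first column holds the index, as in to_markdown() output
    body = body.set_index(body.columns[0])
    if str(body.index.name).startswith("Unnamed"):
        body.index.name = None
    return body


markdown = """
|    |   a |   b |
|---:|----:|----:|
|  0 |   0 |   2 |
|  1 |   1 |   3 |
"""

df = read_markdown_table(markdown)
print(df)
```

Reading from io.StringIO means the same function can be fed text pasted from a post, without saving it to df.md first. Tables whose cells themselves contain pipe characters are not handled by this sketch.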
answered Sep 16 '22 15:09 by Dave