With pandas 1.0.0 the use of .to_markdown() to show the content of a dataframe in this forum in markdown is going to proliferate. Is there a convenient way to load the data back into a dataframe? Maybe an option to .from_clipboard(markdown=True)? 
If you run R code in the console or the RStudio GUI (for example, reading in a data set by pasting code into the console or using the Import Dataset button in the Environment tab), you won’t be able to use the results in your markdown file. Any and all commands you need, including reading in data, need to be included in the file.
This part is fairly standard Python. We read the markdown file in line by line and build two strings: ym, which holds the yaml text, and md, which holds the markdown text. Python lets us treat a text file as a sequence of lines that we can iterate over with a for loop.
For example, to convert a Markdown file, you can pass it to the markdown command as follows, replacing filename.md with the name of the file you want to convert: Executing this command will print the HTML code for the Markdown text that’s present in the filename.md file.
The yaml is contained between the two '---' markers. The rest of the file (after the second '---') is the markdown content of the file. But for brevity we will call the entire file a markdown file.
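The line-by-line split described above can be sketched roughly as follows. The names ym and md come from the text; the function name split_front_matter and the sample content are made up for illustration, and the sketch assumes that each '---' marker sits alone on its own line.

```python
def split_front_matter(lines):
    """Split an iterable of lines into (ym, md) around the two '---' markers."""
    ym, md = "", ""
    markers_seen = 0
    for line in lines:
        # Only the first two '---' lines count as markers; a later '---'
        # (for example a horizontal rule) stays in the markdown text.
        if markers_seen < 2 and line.strip() == "---":
            markers_seen += 1
            continue
        if markers_seen == 1:
            ym += line      # between the markers: yaml text
        elif markers_seen == 2:
            md += line      # after the second marker: markdown text
        # Anything before the first marker is ignored in this sketch.
    return ym, md


# Hypothetical sample content, standing in for a real file.
sample = "---\ntitle: Demo\ndraft: false\n---\n# Heading\n\nBody text.\n"
ym, md = split_front_matter(sample.splitlines(keepends=True))
```

For a real file, the open file object can be passed in directly, since iterating over it yields one line at a time.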
You can read markdown tables (or any structured text table) with the pandas read_table function:
Let's create a sample markdown table:
import pandas as pd
print(pd.DataFrame({"a": [0, 1], "b": [2, 3]}).to_markdown())
|    |   a |   b |
|---:|----:|----:|
|  0 |   0 |   2 |
|  1 |   1 |   3 |
As you can see, this is just a structured text table: the delimiters are pipes, there is a lot of whitespace, the left-most and right-most columns are null, and there is a header underline row that must be dropped.
df = (
    # Read a markdown file, getting the header from the first row and the index from the second column
    pd.read_table('df.md', sep="|", header=0, index_col=1, skipinitialspace=True)
    # Drop the left-most and right-most null columns
    .dropna(axis=1, how='all')
    # Drop the header underline row
    .iloc[1:]
)
   a  b
0  0  2
1  1  3