Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I pull information from Excel into PowerPoint using Python and keep the format?

I've written a script with python's xlrd and pptx to read each workbook in a directory and pull information from each sheet into a table in a PowerPoint slide. It works okay if the excel table is small but I don't know what will be in these excel files. It becomes illegible when there is too many rows and columns. My main problem arose when an excel file had graphs instead of cells and the script couldn't read it. So I tried using pyscreenshot to open the document and take a screenshot but this seems slow and unnecessary. I'd like to make a slide in the PowerPoint look exactly as it would in excel but with the ability to add and change things.

import libraries and modules
import xlrd
from pptx import Presentation
from pptx.util import Inches, Pt
import time
import glob
import os

start = time.time()

prs = Presentation()
title_slide_layout = prs.slide_layouts[0]
slide = prs.slides.add_slide(title_slide_layout)
shapes = slide.shapes
title = slide.shapes.title
subtitle = slide.placeholders[1]

title.text = "Dashboard Generator"
subtitle.text = "made with Python-pptx and xlrd"

for filename in glob.glob(os.path.join("C:/Users/penelope/Desktop/PMO/myfiles/", '*.xlsx')):
    print(filename)
    file_location = filename
    try: 
        workbook = xlrd.open_workbook(file_location)
        nsheets = workbook.nsheets
        for n in range(0, nsheets):
            sheet = workbook.sheet_by_index(n)
            print("sheet:", sheet)
            rows = sheet.nrows
            cols = sheet.ncols
            c = cols
            r = rows
            if c > 0:
                print(c, r)
                slide = prs.slides.add_slide(prs.slide_layouts[5])
                shapes = slide.shapes
                title = slide.shapes.title
                title.text = "Table testing"
                left = Inches(0.0)
                top = Inches(2.0)
                width = Inches(6.0)
                height = Inches(4.0)
                num = 10.0/c
                table = shapes.add_table(rows, cols, left, top, width, height).table
                for i in range(0, c):
                    table.columns[i].width = Inches(num)
                for i in range(0,r):
                    for e in range(0,c):
                        table.cell(i,e).text = str(sheet.cell_value(i,e))
                        cell = table.rows[i].cells[e]
                        paragraph = cell.text_frame.paragraphs[0]
                        paragraph.font.size = Pt(11)
    except:
        print("Error!")
        pass

prs.save('powerpointfile1.pptx')
end = time.time()
print(end - start)

And this is my screenshot script:

import os
import time
import pyscreenshot as ImageGrab
from PIL import Image

if __name__ == "__main__":
    os.system('start excel.exe "C:/Users/penelope/Desktop/PMO/TestCase.xlsx"')
    time.sleep(3)
    im=ImageGrab.grab(bbox=(24,210,1800,990))
    im.save("image7.png")
    img = Image.open('image7.png')
    img.show()
like image 898
Penelope Avatar asked Oct 18 '22 05:10

Penelope


1 Answers

Well, you've chosen a hard problem. Certainly all the times I've attempted this sort of thing I've ended up abandoning the effort.

The fundamental explanation I formed was that Excel (and Word) are "flowed" document environments. That is, when you run out of room on one page, it flows to the next. PowerPoint, on the other hand, is a page-by-page exhibit layout environment. Each slide is independent of the rest (evidenced by the ability to reorder slides freely), each meant to be shown all at once, and not scrolled. This leads to each slide being self-contained, which means constrained to a single "page".

There's a limit to how much information one can place on a slide and still have it communicate. Generally less is better. So, perhaps it's not a surprise all my early efforts there ended in frustration :) I also concluded that an effective "dashboard" slide would require very skillful layout, and extreme restraint on content length, probably requiring specific (human) summarization effort (not just copying from a "database").

Regarding the charts bit, those theoretically can be moved to PowerPoint and I've even seen it done, but it's technically quite challenging. There is no API support for it in python-pptx. This historical issue on the GitHub repo may give some idea what was involved. Not for the faint of heart I expect :)

like image 159
scanny Avatar answered Oct 22 '22 09:10

scanny