
Using Python-pptx, what conditions could a PowerPoint have that give KeyError?

I have a PowerPoint that I would like to open, amend, and save as a different filename. However, I'm getting a KeyError.

I tried this code with a blank PowerPoint presentation and it works perfectly. However, when I use the same code to open an existing PowerPoint presentation, I get a KeyError.

KeyError: "There is no item named 'ppt/slides/NULL' in the archive"

# Replace source text (dashboard_filter2 is defined elsewhere, not shown)
import re
from pptx import Presentation
#s = "string. With. Punctuation?"
#s = re.sub(r'[^\w\s]','',s)

search_str = '{{{FILTER}}}'
repl_str = re.sub(r'[^\w\s]', '', str(list(dashboard_filter2.values())))
ppt = Presentation('HispPres1.pptx')

for slide in ppt.slides:
    for shape in slide.shapes:
        if shape.has_text_frame:
            shape.text = shape.text.replace(search_str, repl_str)
ppt.save('HispPresSourceUpdate.pptx')

I expect to have the existing PowerPoint amended by finding all the instances of {{{FILTER}}} and replacing it with the value listed. However, it looks like there's a problem using my existing PowerPoint presentation. I don't have this issue with a blank presentation.

So, I'm wondering what would cause an existing PowerPoint presentation to raise this error. I plan on making several "templates" to start with and really need to know if there are any hard-and-fast rules to adhere to.

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-42-41deffabe2f9> in <module>()
      7 search_str = '{{{FILTER}}}'
      8 repl_str = re.sub(r'[^\w\s]','',(str(list(dashboard_filter2.values()))))
----> 9 ppt = Presentation('HispPres1.pptx')
     10 
     11 for slide in ppt.slides:

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pptx\api.py in Presentation(pptx)
     28         pptx = _default_pptx_path()
     29 
---> 30     presentation_part = Package.open(pptx).main_document_part
     31 
     32     if not _is_pptx_package(presentation_part):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pptx\opc\package.py in open(cls, pkg_file)
    120         *pkg_file*.
    121         """
--> 122         pkg_reader = PackageReader.from_file(pkg_file)
    123         package = cls()
    124         Unmarshaller.unmarshal(pkg_reader, package, PartFactory)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pptx\opc\pkgreader.py in from_file(pkg_file)
     34         pkg_srels = PackageReader._srels_for(phys_reader, PACKAGE_URI)
     35         sparts = PackageReader._load_serialized_parts(
---> 36             phys_reader, pkg_srels, content_types
     37         )
     38         phys_reader.close()

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pptx\opc\pkgreader.py in _load_serialized_parts(phys_reader, pkg_srels, content_types)
     67         sparts = []
     68         part_walker = PackageReader._walk_phys_parts(phys_reader, pkg_srels)
---> 69         for partname, blob, srels in part_walker:
     70             content_type = content_types[partname]
     71             spart = _SerializedPart(partname, content_type, blob, srels)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pptx\opc\pkgreader.py in _walk_phys_parts(phys_reader, srels, visited_partnames)
    102             yield (partname, blob, part_srels)
    103             for partname, blob, srels in PackageReader._walk_phys_parts(
--> 104                     phys_reader, part_srels, visited_partnames):
    105                 yield (partname, blob, srels)
    106 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pptx\opc\pkgreader.py in _walk_phys_parts(phys_reader, srels, visited_partnames)
    102             yield (partname, blob, part_srels)
    103             for partname, blob, srels in PackageReader._walk_phys_parts(
--> 104                     phys_reader, part_srels, visited_partnames):
    105                 yield (partname, blob, srels)
    106 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pptx\opc\pkgreader.py in _walk_phys_parts(phys_reader, srels, visited_partnames)
     99             visited_partnames.append(partname)
    100             part_srels = PackageReader._srels_for(phys_reader, partname)
--> 101             blob = phys_reader.blob_for(partname)
    102             yield (partname, blob, part_srels)
    103             for partname, blob, srels in PackageReader._walk_phys_parts(

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pptx\opc\phys_pkg.py in blob_for(self, pack_uri)
    107         matching member is present in zip archive.
    108         """
--> 109         return self._zipf.read(pack_uri.membername)
    110 
    111     def close(self):

~\AppData\Local\Continuum\anaconda3\lib\zipfile.py in read(self, name, pwd)
   1312     def read(self, name, pwd=None):
   1313         """Return file bytes (as a string) for name."""
-> 1314         with self.open(name, "r", pwd) as fp:
   1315             return fp.read()
   1316 

~\AppData\Local\Continuum\anaconda3\lib\zipfile.py in open(self, name, mode, pwd, force_zip64)
   1350         else:
   1351             # Get info object for name
-> 1352             zinfo = self.getinfo(name)
   1353 
   1354         if mode == 'w':

~\AppData\Local\Continuum\anaconda3\lib\zipfile.py in getinfo(self, name)
   1279         if info is None:
   1280             raise KeyError(
-> 1281                 'There is no item named %r in the archive' % name)
   1282 
   1283         return info

KeyError: "There is no item named 'ppt/slides/NULL' in the archive"
SoSincere3 asked Feb 03 '26 19:02



1 Answer

Yeah, this is a bit of a thorny problem. The spec doesn't provide for a "broken" relationship (one that refers to a package-part that doesn't exist), but at least one library (Java-based if I recall correctly) does not clean up relationships properly in some cases, perhaps a slide delete operation in this case.

The gist of the explanation is this:

  • A PPTX file is an Open Packaging Convention (OPC) package. DOCX and XLSX files are other examples of OPC packages.
  • An OPC package is a Zip archive of multiple parts (official term, perhaps package-part more precisely). Each part is essentially a file, so something like slide1.xml, and they are arranged in a "directory structure".
  • One part can be related to other parts. For example, a presentation part (presentation.xml) is related to each of its slide parts. These relationships are stored in a file like presentation.xml.rels. The relationship is keyed with a string like "rId3" and identifies the related part by its path in the package.
  • One part refers to another using the key in its XML (e.g. <p:sldId r:id="rId3"/>). The target part is "looked-up" in the .rels file to find its path and get to it that way.
  • The KeyError you're getting means that the .rels file has a <Relationship> element referring to the part ppt/slides/NULL (instead of something like ppt/slides/slide3.xml). Since there is no such part in the package, the lookup fails.
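The lookup described above can be checked without python-pptx, since the package is an ordinary Zip archive. What follows is a minimal sketch, not part of python-pptx: the function name find_broken_relationships is hypothetical, and it assumes relative targets resolve against the directory of the part that owns the .rels file. It lists every internal relationship whose target is missing from the archive.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by every .rels file in an OPC package.
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def find_broken_relationships(pptx_file):
    """Return (rels_member, rId, resolved_target) for each dangling relationship.

    pptx_file may be a path or a file-like object, as accepted by ZipFile.
    """
    broken = []
    with zipfile.ZipFile(pptx_file) as zf:
        members = set(zf.namelist())
        for rels_name in sorted(members):
            if not rels_name.endswith(".rels"):
                continue
            # "ppt/slides/_rels/slide1.xml.rels" belongs to a part in "ppt/slides".
            base_dir = posixpath.dirname(posixpath.dirname(rels_name))
            root = ET.fromstring(zf.read(rels_name))
            for rel in root.iter("{%s}Relationship" % REL_NS):
                # External targets (e.g. hyperlinks) are not parts in the archive.
                if rel.get("TargetMode") == "External":
                    continue
                target = rel.get("Target", "")
                if target.startswith("/"):
                    resolved = target.lstrip("/")
                else:
                    resolved = posixpath.normpath(posixpath.join(base_dir, target))
                if resolved not in members:
                    broken.append((rels_name, rel.get("Id"), resolved))
    return broken
```

Run against a file like the one in the question, e.g. find_broken_relationships('HispPres1.pptx'), this should report the .rels member and rId behind the ppt/slides/NULL target, which tells you where to look when repairing the package.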

If you open the "template" file in PowerPoint and save it, I think it will repair itself. You might need to rearrange a slide and move it back to jostle that part of the code.

If that doesn't work, you'll need to patch the package by hand, removing any broken references and relationships. opc-diag can be handy for that.
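As a rough illustration of the hand-patching step, the sketch below copies a package to a new file while leaving out relationships whose target part is missing. It is an assumption-laden sketch, not an opc-diag feature: the function name drop_broken_relationships is hypothetical, it only rewrites .rels files, and any element in the owning part that still refers to a dropped rId (such as a <p:sldId>) would still need to be removed by hand. Write to a different file than the source and keep the original.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
XML_DECL = b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'


def drop_broken_relationships(src, dst):
    """Copy package src to dst, omitting relationships with a missing target.

    Returns a list of (rels_member, rId, resolved_target) that were dropped.
    src and dst must be different files (or file-like objects).
    """
    # Serialize .rels with a default namespace instead of an "ns0:" prefix.
    ET.register_namespace("", REL_NS)
    rel_tag = "{%s}Relationship" % REL_NS
    dropped = []
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        members = set(zin.namelist())
        for info in zin.infolist():
            data = zin.read(info.filename)
            if info.filename.endswith(".rels"):
                # Relative targets resolve against the owning part's directory.
                base_dir = posixpath.dirname(posixpath.dirname(info.filename))
                root = ET.fromstring(data)
                changed = False
                for rel in list(root):
                    if rel.tag != rel_tag or rel.get("TargetMode") == "External":
                        continue
                    target = rel.get("Target", "")
                    if target.startswith("/"):
                        resolved = target.lstrip("/")
                    else:
                        resolved = posixpath.normpath(
                            posixpath.join(base_dir, target))
                    if resolved not in members:
                        root.remove(rel)
                        dropped.append((info.filename, rel.get("Id"), resolved))
                        changed = True
                if changed:
                    data = XML_DECL + ET.tostring(root, encoding="utf-8")
            # Every member is copied; only patched .rels files differ.
            zout.writestr(info.filename, data)
    return dropped
```

The returned list names each dropped rId, so you know which references to search for in the owning part's XML before trying the patched file with python-pptx.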

scanny answered Feb 06 '26 13:02



