
Extracting all text from a PowerPoint file in VBA

I have a huge set of PowerPoint files from which I want to extract all the text and lump it into one big text file. Each source (PPT) file has multiple pages (slides). I do not care about formatting, only the words.

I could do this manually for one file: ^A ^C in PowerPoint, then ^V in Notepad, then page down in the PPT and repeat for each slide. (Too bad there isn't a single ^A that grabs EVERYTHING; then I could use SendKeys to copy and paste.)

But there are many hundreds of these PPTs with different numbers of slides.

It seems like this would be a common thing to want to do, but I can't find an example anywhere.

Does anyone have sample code to do this?

elbillaf asked Jan 12 '11 23:01





1 Answer

Here's some code to get you started. It dumps all text from the slides of the active presentation to the debug (Immediate) window. It doesn't try to format, group or do anything other than just dump.

Sub GetAllText()
    ' Dump the text of every shape on every slide to the Immediate window.
    Dim p As Presentation: Set p = ActivePresentation
    Dim s As Slide
    Dim sh As Shape
    For Each s In p.Slides
        For Each sh In s.Shapes
            ' Only shapes with a text frame can hold text.
            If sh.HasTextFrame Then
                If sh.TextFrame.HasText Then
                    Debug.Print sh.TextFrame.TextRange.Text
                End If
            End If
        Next sh
    Next s
End Sub
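
Since the question is about hundreds of files and one big text file, here is a rough sketch of how the same loop might be extended to cover a whole folder. It is meant to run from the VBA editor inside PowerPoint. The folder path, the output file name and the helper name WriteShapeText are all made up for illustration, so adjust them to your setup. It also looks inside grouped shapes and table cells, which the simple HasTextFrame check does not reach. Slide notes, charts and SmartArt are not covered.

```vba
Option Explicit

' Hypothetical locations; change both before running.
Private Const SOURCE_FOLDER As String = "C:\Decks\"
Private Const OUTPUT_FILE As String = "C:\Decks\all_text.txt"

Sub ExportAllTextFromFolder()
    Dim fileName As String
    Dim pres As Presentation
    Dim sld As Slide
    Dim shp As Shape
    Dim fileNum As Integer

    ' Open (or overwrite) the single output text file.
    fileNum = FreeFile
    Open OUTPUT_FILE For Output As #fileNum

    ' Dir returns one matching file name per call, "" when done.
    fileName = Dir(SOURCE_FOLDER & "*.ppt*")
    Do While Len(fileName) > 0
        ' Open read-only and without a window so nothing flashes on screen.
        Set pres = Presentations.Open(SOURCE_FOLDER & fileName, _
                                      ReadOnly:=msoTrue, WithWindow:=msoFalse)
        Print #fileNum, "=== " & fileName & " ==="
        For Each sld In pres.Slides
            For Each shp In sld.Shapes
                WriteShapeText shp, fileNum
            Next shp
        Next sld
        pres.Close
        fileName = Dir
    Loop

    Close #fileNum
End Sub

' Hypothetical helper: writes the text of one shape, descending into
' groups and table cells.
Private Sub WriteShapeText(ByVal shp As Shape, ByVal fileNum As Integer)
    Dim child As Shape
    Dim r As Long, c As Long

    If shp.Type = msoGroup Then
        For Each child In shp.GroupItems
            WriteShapeText child, fileNum
        Next child
    ElseIf shp.HasTable Then
        For r = 1 To shp.Table.Rows.Count
            For c = 1 To shp.Table.Columns.Count
                With shp.Table.Cell(r, c).Shape.TextFrame
                    If .HasText Then Print #fileNum, .TextRange.Text
                End With
            Next c
        Next r
    ElseIf shp.HasTextFrame Then
        If shp.TextFrame.HasText Then
            Print #fileNum, shp.TextFrame.TextRange.Text
        End If
    End If
End Sub
```

Run ExportAllTextFromFolder from any open presentation. The "=== name ===" separator lines are only there so you can tell which file each chunk of text came from; drop that Print line if you want nothing but the words.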
Todd Main answered Oct 23 '22 04:10
