
Export chat messages in Microsoft Teams channels programmatically to Word or PDF without needing admin roles, privileges or permissions

I am looking for a programmatic way to export all chat messages (text and image content) of each Microsoft Teams channel to a Word or PDF document (any output medium that supports text and images). I need to be able to do this without asking the corporate global admin for specific roles or permissions. I have already studied several methods: the Graph API (Azure app registration), eDiscovery, and extracting the information from a hidden Outlook folder. The common theme in these methods is that each requires some permission from the IT admin.

So far, I have used the web app version of Microsoft Teams with web scraping methods. With that, I have been able to cycle through the messages in each channel and export them into a Word document. I was wondering if there is a more elegant, less error-prone method.

Looking for some suggestions.

user3791998 asked Oct 21 '25 16:10



2 Answers

This JavaScript script of mine can surely be optimized, but it works. To use it, browse to the Teams conversation of your choosing, directly on the https://teams.microsoft.com website. Paste the script into the console pane of your developer tools, INSPECT at least one element within the chat pane, then run the script.

In a nutshell... the script looks for and copies the div elements in the page which I identified to be holding the individual messages. It keeps a copy of each message as the page scrolls up. When the script thinks it is done scrolling, it puts all the stored messages in a new browser window which opens for you. You can decide to print, or save the content of the new window how you like.
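
The "keep a copy of each message as the page scrolls up" idea can be sketched on its own, without the Teams page. This is only an illustration under my own assumptions: `createMessageStore` and its methods are hypothetical names, and it uses a `Map` keyed by message id where the script itself searches an array. The point it shows is that a virtual list hands you overlapping batches, so each message must be stored once, by id.

```javascript
// Hypothetical helper, not part of the script: a store that keeps
// the first copy of each message, keyed by its id.
function createMessageStore() {
    const byId = new Map(); // a Map preserves insertion order

    return {
        // Adds the snapshot unless the id was already seen.
        // Returns true when something new was stored.
        add(id, snapshot) {
            if (byId.has(id)) {
                return false;
            }
            byId.set(id, snapshot);
            return true;
        },
        size() {
            return byId.size;
        },
        // Messages are collected newest-first while scrolling up,
        // so reverse them to get chronological order.
        inChronologicalOrder() {
            return Array.from(byId.values()).reverse();
        }
    };
}

// Simulate two overlapping passes over a virtual list (newest first).
const store = createMessageStore();
[["3", "c"], ["2", "b"]].forEach(([id, text]) => store.add(id, text));
[["2", "b"], ["1", "a"]].forEach(([id, text]) => store.add(id, text));
console.log(store.inChronologicalOrder()); // a, b, c
```

Comparing `size()` before and after a pass is also a cheap way to tell whether a scroll brought in anything new.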

Unless you want to stack multiple chats, the first pass on any given chat history should be done calling main() or main(true). All subsequent passes should be done using main(false).

Needless to say, the longer the chat history, the longer the process will take to complete. With this script, I have successfully transferred multiple chat histories to PDF.

/*
For this script to work you must be viewing Teams from your browser.
https://teams.microsoft.com
This script was tested in Firefox.
You must be using Developer tools of your browser
YOU MUST INSPECT a message, otherwise the HTML elements will not be available
The elements are buried in an iFrame, and you must inspect something for the iFrame code to be available to the script
Once you have inspected something, go to the Console tab of your developer tools
paste this whole script and run it
You can run it by pressing the Play button,
or with the keyboard shortcut Ctrl + Enter
It will first scroll all the way to the first message
Then it will create a new browser window with the messages in it
Then it is up to you to do what you want with it
I suggest you print the content to pdf (Ctrl + P)

It is possible the process will stop before reaching the top
If that is the case and you want to keep scrolling,
you can retrigger the script but call it with a false parameter to maintain the existing storage list
main(false).
*/

/* 
 * how long to wait (in milliseconds) before sniffing the message pane after a scroll
 * if it is set too low, or not at all, the messages do not have enough time to load
 * the slower your computer or connection, the higher this value should be
 * the higher this value will be, the longer it will take to process a chat window
 */
var delayTime = 500;
/* log messages to console */
var logToConsole = true;
/* perhaps you only need a few scrolls worth of data */
var maxScrollOccurences = 0;//0 for infinite
/* 
 * sometimes having the right delay value is not enough
 * so, we add a fudge factor in case the delay was not long enough
 * this allows us to re-sniff the message pane without scrolling again
 * when we run out of fudge, we assume that we have reached the top of the message pane
 */
var maxFudge = () => { return 10; }

/*
you should not need to change any of the variables below
 */


/*
 * This is the div in which the scroll bar resides
 */
var scrollDiv = () => { return document.getElementById("main-window-body").querySelector('[data-tid="message-pane-list-viewport"]'); }
/* 
 * this is where all the messages are displayed
 */
var chatPaneList = () => { return document.getElementById("chat-pane-list"); }
/*
 * all the messages being displayed in the chat pane
 * in a virtual list
 * as more are added, some are dropped and some lose their content
 * so, you must keep track of the messages as you scroll up the message pane
 */
var messagesInChatPane = () => { return chatPaneList().querySelectorAll('.fui-unstable-ChatItem'); }
/* this flag stops the scrolling while-loop */
var stop;
/* number of stored messages before a scroll, used to decide whether we should keep scrolling */
var msgOnSCreenBeforeScroll;
/* number of stored messages after a scroll, used to decide whether we should keep scrolling */
var msgOnSCreenAfterScroll;
/* this keeps track of how many times we have scrolled so far */
var scrollOccurences = 0;
/* this array will hold the messages to display */
var messagesToDisplay = [];

var firstRender = true;

main();//run this one to start a fresh process
//main(false);//run this one if you want to keep processing an existing window

/*
 * This is the only function you should call directly
 * set FirstRender to true, or omit it, when you run the script for the first time in a chat pane
 * but, set FirstRender to false for all subsequent runs of the same chat pane
 */
async function main(FirstRender) {
    if (FirstRender != undefined) {
        firstRender = FirstRender
    }
    await ScrollToTopOfMessages();
    PrintIt();
}




async function ScrollToTopOfMessages() {
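    var fudgeFactor = maxFudge();
    stop = false;
    msgOnSCreenBeforeScroll = 0;
    scrollOccurences = 0;//count scrolls per run, so maxScrollOccurences also applies to main(false) passes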

    if (firstRender) {
        /* start with an empty list */
        messagesToDisplay = [];
    }

    while (!stop) {
        msgOnSCreenBeforeScroll = messagesToDisplay.length;
        LogThis("displayed before: " + msgOnSCreenBeforeScroll);
        await ScrollAndWait();
        scrollOccurences++;
        msgOnSCreenAfterScroll = messagesToDisplay.length;
        LogThis("displayed after: " + msgOnSCreenAfterScroll);

        //if more items are displayed than before, we continue scrolling
        //otherwise we stop and proceed to the next step
        var fudging = false;
        if (msgOnSCreenBeforeScroll == msgOnSCreenAfterScroll) {
            scrollOccurences--;//fudges do not count, only scrolls that actually occurred
            fudgeFactor--;
            LogThis("fudging " + fudgeFactor + "/" + maxFudge());
            fudging = true;
            if (fudgeFactor == 0) {
                stop = true;
                LogThis("too much fudging");
            }
        }
        else {
            fudgeFactor = maxFudge();
        }

        /*
         * stop scrolling if we have reached the maxScrollOccurences
         */
        if (!fudging && maxScrollOccurences > 0 && scrollOccurences == maxScrollOccurences) {
            LogThis("bumped into maxScrollOccurences");
            stop = true;
        }
    }
}

async function ScrollAndWait() {
    //msgOnSCreenBeforeScroll = msgOnSCreenAfterScroll;
    scrollDiv().scrollTop = 0;
    await delay(delayTime);//let the messages load

    var messages = messagesInChatPane();
    for (var i = messages.length - 1; i > -1; i--) {
        if (messages[i].querySelector('.fui-Divider') != undefined) {
            //we do not want dividers
            LogThis("skipping divider");
            continue;
        }
        var msgId = messages[i].querySelector('[id ^= "timestamp-"]');
        if (msgId == undefined) {
            LogThis("message with no time stamp... ");
            LogThis(messages[i]);
            continue;
        }
        msgId = msgId.id.replace("timestamp-", "");


        if (messagesToDisplay.findIndex(p => p.id === msgId) == -1) {
            LogThis(messagesToDisplay.length + " before adding to messagesToDisplay");
            var en = new entry(messagesToDisplay.length, msgId, messages[i]);
            messagesToDisplay.push(en);
            LogThis(messagesToDisplay.length + " after adding to messagesToDisplay");
        }
    }
}

async function delay(time) {
    return new Promise(resolve => setTimeout(resolve, time));
}

async function PrintIt() {
    LogThis(messagesToDisplay);
    var WinPrint = window.open('', '', 'left=0,top=0,width=800,height=900,toolbar=0,scrollbars=0,status=0');
    for (var i = messagesToDisplay.length - 1; i > -1; i--) {
        var msgDiv = messagesToDisplay[i].element.cloneNode(true);
        msgDiv = RemoveHeaderDiv(msgDiv);
        msgDiv = CleanupEmoji(msgDiv);
        msgDiv = FixImages(msgDiv);
        WinPrint.document.body.appendChild(msgDiv);
    }
    //WinPrint.document.close();
    WinPrint.focus();
    //WinPrint.print();
    //WinPrint.close();
}

/*
 * the header contains the message preview
 * useless in what we are trying to do here
 */
function RemoveHeaderDiv(element) {
    var msgHeader = element.querySelector('[role="heading"]');
    if (msgHeader != undefined) {
        msgHeader.remove();
    }
    return element;
}

/* 
 * the css is not maintained in the new browser window, on purpose
 * and it breaks the Emojis
 * this fixes them
 */
function CleanupEmoji(element) {
    element.querySelectorAll('[itemtype="http://schema.skype.com/Emoji"]').forEach(p => p.setAttribute("src", ""))
    return element;
}

/*
images appear in the new browser window
but most of the images disappear when the window gets printed to pdf
this fixes the problem
 */
function FixImages(element) {
    var imgs = element.querySelectorAll("img");
    for (var i = 0; i < imgs.length; i++) {
        var tmp = imgs[i].getAttribute("data-orig-src");
        if (tmp != undefined) {
            //swap the image in place so it keeps its position within the message
            var newimg = document.createElement("img");
            newimg.src = tmp;
            imgs[i].replaceWith(newimg);
        }
    }
    return element;
}

function LogThis(messageToLog) {
    if (logToConsole) {
        console.log(messageToLog);
    }
}

function entry(index, id, element) {
    this.index = index;
    this.id = id;
    this.element = element;
}
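
The stopping rule in ScrollToTopOfMessages (retry a few times when a pass finds nothing new, reset the retry budget when it does) is easier to reason about as a pure function. The sketch below is a hypothetical restatement, not a drop-in replacement: `nextScrollState`, `fudgeLeft`, `scrolls` and `maxScrolls` are names I made up for the illustration, and it uses `>=` for the scroll limit where the script uses `==`.

```javascript
// Hypothetical pure version of the stopping rule; all names are illustrative.
// state:  { fudgeLeft, scrolls }
// config: { maxFudge, maxScrolls }   (maxScrolls 0 = unlimited)
function nextScrollState(state, config, countBefore, countAfter) {
    if (countAfter === countBefore) {
        // Nothing new appeared: spend one retry instead of counting a scroll.
        const fudgeLeft = state.fudgeLeft - 1;
        return { fudgeLeft, scrolls: state.scrolls, stop: fudgeLeft <= 0 };
    }
    // New messages appeared: count the scroll and restore the retry budget.
    const scrolls = state.scrolls + 1;
    const hitLimit = config.maxScrolls > 0 && scrolls >= config.maxScrolls;
    return { fudgeLeft: config.maxFudge, scrolls, stop: hitLimit };
}

// Usage: feed it the stored-message count before and after each pass.
const config = { maxFudge: 2, maxScrolls: 0 };
let state = { fudgeLeft: config.maxFudge, scrolls: 0, stop: false };
const counts = [[0, 5], [5, 9], [9, 9], [9, 9]]; // [before, after] per pass
for (const [before, after] of counts) {
    state = nextScrollState(state, config, before, after);
    if (state.stop) break;
}
console.log(state); // stops after two empty passes in a row
```

Keeping the rule separate from the DOM code means it can be tried in Node with made-up counts before running it against a long chat.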
blaze_125 answered Oct 23 '25 07:10



Here's a minimal solution. It is admittedly incomplete, but it does work to some degree.

You inspired me to clean up my attempt and publish it:

https://github.com/poleguy/selenium_teams

Tested on Ubuntu 20.04 only.

Clone the repo.

Run ./setup_python to create a conda environment.

Edit the script to specify your url/login.

Run python ./selenium_teams.py

Log in manually in the browser that pops up.

Go to the chat you want to slurp. Click in the "Type a new message" section.

Press enter to let python continue.

This will start to save all the messages to a text file.

Once you have the text file, convert it to PDF or Word.

(Incomplete: can't do images, runs very slowly, may run out of memory.... very little testing.)
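
For the last step, one possible route is to wrap the text in a bare HTML page, which a browser can print to PDF and Word can open. The sketch below assumes one message per line in the exported text, which I have not checked against the repo's actual output; `escapeHtml` and `textToHtml` are hypothetical helpers, and the sample text is made up.

```javascript
// Hypothetical converter: turns exported plain text into a minimal HTML page.
function escapeHtml(text) {
    return text
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
}

// Assumes one message per non-empty line (an assumption about the export format).
function textToHtml(text, title) {
    const paragraphs = text
        .split(/\r?\n/)
        .filter(line => line.trim() !== "")
        .map(line => "<p>" + escapeHtml(line) + "</p>")
        .join("\n");
    return "<!DOCTYPE html>\n<html><head><meta charset=\"utf-8\"><title>" +
        escapeHtml(title) + "</title></head>\n<body>\n" + paragraphs + "\n</body></html>\n";
}

// Example with inline sample text; in practice the input would be the exported file's content.
const sample = "Alice: is 1 < 2?\n\nBob: yes & always";
console.log(textToHtml(sample, "Teams export"));
```

Save the result with an .html extension, then open it in a browser and print to PDF, or open it in Word and save as a document.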

poleguy answered Oct 23 '25 07:10



