Get Outlook calendar meeting data using Python

Sharing is caring!

Last Updated on July 14, 2022 by Jay

Recently I made a little program to get Outlook calendar meeting data using Python. The program was a result of my own curiosity and, most importantly, out of frustration with too many work meetings.

I work for an insurance company in a major North American city. Everyone in the company has been working from home since March when Covid hit. In order to “keep everyone informed with more frequent communication”, people started meeting online more often. As a result, my Outlook calendar started to get really crowded. Sometimes I feel less efficient at work, because of so many meetings. I needed quiet time so I can focus on things and make progress.

So I made a case to my manager & team. I showed them how much time we spent on meetings. Without a question, some meetings are essential. My point was that we spent way too much time than necessary to touch base and provide updates. In my profession, we need proof that is based on data. Because almost all the meetings are booked on Outlook calendar, that’s a good starting point.

Try the code here and leave a comment to let me know how many hours you spent on “weekly update meetings”. You might be surprised. Keep reading to see how many hours I’ve spent…

The process

The thought process is a simple 3-step:

  1. Get data from Outlook
  2. Filter & clean up the data
  3. Create a summary report for easy viewing

Remember, we are lazy workers, so we don’t go into Outlook and copy/paste or manually enter info into an Excel spreadsheet. We let machines do the heavy lifting. Once again, Python proved to be a powerful ally.

Part 0. Required libraries

We are going to need a few libraries: win32com, pandas and datetime.

datetime is a Python built-in library. win32com and pandas are external libraries that require installation.

  • win32com is a powerful library that provides access to many of the Windows APIs from Python. Yes, it means you can use win32com to automate many Windows tasks using Python.
  • pandas library is the golden standard for data analysis.

Let’s go ahead and install the external libraries.

pip install pandas
pip install win32com

ERROR: Could not find a version that satisfies the requirement win32com (from versions: none)
ERROR: No matching distribution found for win32com
WARNING: You are using pip version 20.0.2; however, version 20.1.1 is available.
You should consider upgrading via the 'c:\program files\python38\python.exe -m pip install --upgrade pip' command.

No worries if you see this error! It turned out that the win32com library isn’t really named as “win32com”… the correct name is pywin32 when you want to install.

pip install pywin32

So that worked. I’m not sure why the weird naming convention, if anyone knows the history, please do let me know!

We are going to organize our code into several functions, so it’s easy to reuse the code later.

Part 1. Get Outlook calendar meeting in Python

def get_calendar(begin,end):
    outlook = win32com.client.Dispatch('Outlook.Application').GetNamespace('MAPI')
    calendar = outlook.getDefaultFolder(9).Items
    calendar.IncludeRecurrences = True
    calendar.Sort('[Start]')

    restriction = "[Start] >= '" + begin.strftime('%m/%d/%Y') + "' AND [END] <= '" + end.strftime('%m/%d/%Y') + "'"
    calendar = calendar.Restrict(restriction)
    return calendar

We created an “outlook” object in the first line (after the def get_calendar() line). Then outlook.getDefaultFolder(9) gives a list of all the meetings from our Outlook calendar; whereas outlook.getDefaultFolder(6) gives all emails in the inbox. The third line calendar.IncludeRecurrences = True is important because we want to include recurrent meetings. i.e. “weekly update on xxx”.

Then we select a time frame to focus on only a time period. In my case, I used Jan 1, 2020 to July 31, 2020. The time frame is simply a string that looks like this: “[Start] >= ’01/01/2020′ AND [END] <= ’07/31/2020′”. To apply a filter, we use calendar.Restrict() method.for apply a restriction to (a string value) to the calendar to focus on a time period.

datetime.strftime()

The strftime() is a datetime function that allows us to convert a datetime object into a string representing the date and time. For example, the following code converts today (a datetime object) into a string value of “08-04-2020”.

Datetime strftime example

datetime.strptime()

There’s a similar datetime function strptime() but does exactly the opposite – it converts a string value into a datetime object. I always have trouble remembering which one is which. One trick I’ve used to help remembering is that, strftime means “string from time (object)”, so it will give me a string; and the other one automatically does the opposite. I don’t know if the “f” means “from”, but I’ve found this technique helpful. If it doesn’t work for you, it’s just one google away from finding the answer on the Internet.

Finally, we return the calendar object, so we can use it in later part of the code.

Part 2. Filter Outlook calendar items

The previous function returns a calendar object, which is a list containing all the calendar meetings. We can try to access it to see what’s inside.

import datetime as dt
cal = get_calendar(dt.datetime(2020,1,1), dt,datetime(2020,7,31))
Outlook COMObject in Python
Outlook COMObject in Python

Ok, that’s not very helpful. But this object is actually a list-like object, and we can access individual elements just like a normal list.

cal[1].subject gives the meeting name of the 2nd meeting in that list. cal[1].start gives the starting time for the same meeting. Similarly, cal[1].body contains the content (body) of the meeting invite.

Content in the Outlook COMObject
Content in the Outlook COMObject

For this exercise, I don’t want to include all meetings from my Outlook calendar. Since I sometimes block off a chunk of time for myself to focus on work, I don’t want to count those hours as “meetings”. So we need a way to remove unwanted meeting. Our next function will apply filters to the list of all meetings.

def get_appointments(calendar,subject_kw = None,exclude_subject_kw = None, body_kw = None):

    if subject_kw == None:
        appointments = [app for app in calendar]    
    else:
        appointments = [app for app in calendar if subject_kw in app.subject]

    if exclude_subject_kw != None:
        appointments = [app for app in appointments if exclude_subject_kw not in app.subject]
    cal_subject = [app.subject for app in appointments]
    cal_start = [app.start for app in appointments]
    cal_end = [app.end for app in appointments]
    cal_body = [app.body for app in appointments]

    df = pd.DataFrame({'subject': cal_subject,
                       'start': cal_start,
                       'end': cal_end,
                       'body': cal_body})
    return df

This function takes three keyword arguments:

  • subject_kw: keyword in the subject of the meeting invite.
  • exclude_subject_kw: the subject keyword to remove.
  • body_kw: keyword in the body of the meeting invite.

Basically we can loop through the list of meetings and either keep/drop them based on our selection criteria. So list comprehensions becomes super handy in this case.

appointments = [app for app in calendar if subject_kw in app.subject] checks if a keyword is found in the meeting’s subject, then adds the meeting into a list called appointments.

On the last step, we construct a dataframe to store the clean data, then return that dataframe. Now we have meeting names and start/end times, we can go ahead and generate a report!

Part 3. Generate time usage report

This is the easy part, and I couldn’t wait to see how much of my precious time was wasted on meetings. We’ll be doing a little bit of data clean up in this step.

def make_cpd(appointments):
    appointments['Date'] = appointments['start']
    appointments['Hours'] = (appointments['end'] - appointments['start']).dt.seconds/3600
    appointments.rename(columns={'subject':'Meeting Description'}, inplace = True)
    appointments.drop(['start','end'], axis = 1, inplace = True)
    summary = appointments.groupby('Meeting Description')['Hours'].sum()
    return summary

We’ll keep the meeting date. We can also calculate the meeting duration by subtracting “start” time from “end” time (which gives number of seconds), then divide by 3600 to get the number of hours. This is done in this line appointments['Hours'] = (appointments['end'] - appointments['start']).dt.seconds/3600

We also rename the ‘subject’ to something more meaningful “Meeting Description”. This can be done by using the pd.rename() function. It takes in an argument columns, which is a dictionary containing the current column name and the new column name we want to change to. We can change multiple column names at once. Basically just add more items into the dictionary like this: columns = {current_name_1 : new_name_1, current_name_2 : new_name_2, current_name_3 : new_name_3}. Notice this line appointments.rename(columns={'subject':'Meeting Description'}, inplace = True) didn’t assign any values? inplace = True took care of that, it instructs pandas to replace the original dataframe with the newly created dataframe. The following two lines are equivalent:

appointments.rename(columns={'subject':'Meeting Description'}, inplace = True)

appointments = appointments.rename(columns={'subject':'Meeting Description'})

The inplace argument is always sets to False by default. If omitted, nothing will happen to the original dataframe. So appointments.rename(columns={'subject':'Meeting Description'}) basically will change the column name, but the new dataframe is lost after the operation.

One thing to watch out for is that some pandas methods have the inplace argument, while some don’t. It’s hard to remember which one have it and which ones don’t, so I suggest you always look up in the official pandas documentation if in doubt.

Part 4. Putting the code together

import datetime as dt
import pandas as pd
import win32com.client

def get_calendar(begin,end):
    outlook = win32com.client.Dispatch('Outlook.Application').GetNamespace('MAPI')
    calendar = outlook.getDefaultFolder(9).Items
    calendar.IncludeRecurrences = True
    calendar.Sort('[Start]')
    restriction = "[Start] >= '" + begin.strftime('%m/%d/%Y') + "' AND [END] <= '" + end.strftime('%m/%d/%Y') + "'"
    calendar = calendar.Restrict(restriction)
    return calendar

def get_appointments(calendar,subject_kw = None,exclude_subject_kw = None, body_kw = None):
    if subject_kw == None:
        appointments = [app for app in calendar]    
    else:
        appointments = [app for app in calendar if subject_kw in app.subject]
    if exclude_subject_kw != None:
        appointments = [app for app in appointments if exclude_subject_kw not in app.subject]
    cal_subject = [app.subject for app in appointments]
    cal_start = [app.start for app in appointments]
    cal_end = [app.end for app in appointments]
    cal_body = [app.body for app in appointments]

    df = pd.DataFrame({'subject': cal_subject,
                       'start': cal_start,
                       'end': cal_end,
                       'body': cal_body})
    return df

def make_cpd(appointments):
    appointments['Date'] = appointments['start']
    appointments['Hours'] = (appointments['end'] - appointments['start']).dt.seconds/3600
    appointments.rename(columns={'subject':'Meeting Description'}, inplace = True)
    appointments.drop(['start','end'], axis = 1, inplace = True)
    summary = appointments.groupby('Meeting Description')['Hours'].sum()
    return summary

begin = dt.datetime(2020,1,1)
end = dt.datetime(2020,7,31)

cal = get_calendar(begin, end)
appointments = get_appointments(cal, subject_kw = 'weekly', exclude_subject_kw = 'Webcast')
result = make_cpd(appointments)

result.to_excel('meeting hours.xlsx')

Conclusion

I didn’t realize how easy it was to get Outlook calendar meeting data using Python until I finished this project. It was fun and also provided useful insights. During the first 7 months of 2020, my Outlook calendar logged an astonishing 188 hours of “weekly update meetings”, which converts to 1 month and 1 week. In other words, I spent 1/7 of my time just in meetings. Although meetings are definitely useful, I believe I spent most of that time on listening to others’ updates while my mind was wandering elsewhere. Hmm, maybe I should build a bot to attend those meetings for me!

9 comments

  1. Jay,

    Thanks for the excellent post. Could you please let me know if we can show the meetings that has been accepted by me instead of all meeting.

    1. Hi Santanu,

      Thanks for dropping by!

      Assuming that you have used the get_calendar() function to get a list of calendar objects. For each object you can call the .ResponseStatus attribute. The attribute values have the following meanings:
      2 – tentatively accepted
      3 – meeting accepted
      4 – meeting declined
      5 – not responded yet

      Hope that helps!

  2. doesn’t work. Gives error “AttributeError: Can only use .dt accessor with datetimelike values. Did you mean: ‘at’?”

    1. I had to change this line (am in Australia, so the time setting is dmy and not mdy

      restriction = “[Start] >= ‘” + begin.strftime(‘%d/%m/%Y’) + “‘ AND [END] <= '" +
      end.strftime('%d/%m/%Y') + "'"

      This seems to fix the above. I have no explanation.

  3. Have you ever figured out how to use the pywin32 print an outlook calendare view as a pdf or to a printer so you could atuomate having your daily calendar sitting on the printer in the morning for you?

    I couldn’t figure it out, so I asked ChatGPT to write it and that didn’t work due to the calendar_view.ViewType = “olWeekView” caused raise AttributeError(“%s.%s” % (self._username_, attr))

    any suggestions?

  4. Fyi for future people: If you want the calendar items from a calendar that is not your default calendar, use:
    calendar = outlook.getDefaultFolder(9).Folders(“Name-Of-Non-Default-Calendar”).Items
    as the second line of get_calendar()

  5. Hi,
    I would like to get emails of people from external domains also not just my company. Could you please let me know how to achieve it.

  6. I think this is great work, but the one omission in your article is that this MUST be run from a computer running MSWindows. Pywin32 will not install on a Mac or Linux platform. If I am mistaken, please let me know…

Leave a Reply

Your email address will not be published. Required fields are marked *