Email in Python

Email is a good example of a client-server system, and one that gets used millions of times each minute. The email program on a PC is the client, and allows a user to enter text messages, specify destinations, attach images, and use all of the other features expected of such a program. This client packages the email message (data) according to the Simple Mail Transfer Protocol (SMTP) and sends it to another computer on the Internet, the email server. An email user must have an account on the server for this to work, so that they can be identified and can receive replies. The process is as follows: log into the email server, then send the SMTP message to the email server program on that server. Thus, the client side of the contract is to create a properly formatted message, to log into the server properly, and to pass the message to it.

Now the server does the work. Given the destination of the message, it searches for the server that is connected to that destination. For example, given the address xyz@gmail.com, the server for gmail.com is located. Then the email message is sent across the network to that server. The server software at that end reads the message and places it into the mailbox for the specified user, xyz; the mailbox is really just a directory on a disk drive connected to the server. The mail message is essentially a text file at this point.
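The routing decision described above depends only on the part of the address that follows the @ sign. As a rough illustration (the helper name domain_of is made up for this sketch, and the later step of looking that domain up on the network is not shown), the domain can be pulled out of an address like this:

```python
from email.utils import parseaddr

def domain_of(address):
    """Return the domain part of an email address, or '' if there is none."""
    # parseaddr() copes with forms such as 'Some Name <xyz@gmail.com>'
    _name, addr = parseaddr(address)
    # Everything after the last '@' names the domain that the server serves
    _local, sep, domain = addr.rpartition("@")
    if not sep:
        return ""
    return domain.lower()

print(domain_of("xyz@gmail.com"))
```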

This description is simplified but essentially accurate, and describes what has to be done by a program that is supposed to send an email message. The Python module that permits the sending of email implements the protocol and offers the programmer ways to specify the parameters, like the destination and the message. The interface is implemented as a set of functions. The library needed for this is smtplib, a part of the standard Python system.

Example: Sending an email

Sending an email message starts with establishing a connection between the client computer and the user’s mail server, the one on which they have an account (user name and password). For the purposes here, a Gmail (Google) server is used. The email accounts in the example are also Gmail ones, and these can be had for free from Google.

The program must declare smtplib as an imported module. The sending address and the receiving address are the same in this example, but this is just a test. Normally, this will not be the situation. The email address is the user ID for Gmail authentication and the password is defined by the user. These are all strings.

import smtplib

LOGIN = "yourloginID"                  # Login user ID for Gmail, string
PASSWD = "yourpassword"                # Login password for Gmail, string
sndr = "pythontextbook@gmail.com"      # Sender's email address
rcvr = "pythontextbook@gmail.com"      # Receiver's email address
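Writing a password directly into a source file is risky if the file is ever shared. One common alternative, sketched here under the assumption that two environment variables named GMAIL_LOGIN and GMAIL_PASSWD have been set (both names are invented for this example), is to read the strings at run time:

```python
import os

def load_credentials(env=os.environ):
    """Return (login, password) taken from environment-style settings."""
    # GMAIL_LOGIN and GMAIL_PASSWD are hypothetical names chosen for this sketch
    login = env.get("GMAIL_LOGIN")
    passwd = env.get("GMAIL_PASSWD")
    if not login or not passwd:
        raise RuntimeError("Set GMAIL_LOGIN and GMAIL_PASSWD first")
    return login, passwd

# Demonstration with a plain dictionary standing in for the real environment
LOGIN, PASSWD = load_credentials({"GMAIL_LOGIN": "yourloginID",
                                  "GMAIL_PASSWD": "yourpassword"})
print(LOGIN)
```

In a real program the call would simply be load_credentials(), which reads from the actual environment.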

Part of the SMTP scheme is a syntax for email messages. There is a header at the beginning that specifies the sender, receiver, and subject of the message. These are used to format the message, not to route it—the receiver address is specified later. A simple such message looks like this:

From: user me@gmail.com

To: user you@gmail.com

Subject: Just a message

A string must be constructed that contains this information:

msgt = "From: user me@gmail.com\n"
msgt = msgt + "To: user you@gmail.com\n"
msgt = msgt + "Subject: Just a message\n"
msgt = msgt + "\n"

Now the body of the message is attached to this string. This is the part of the email that is important to the sender:

msgt = msgt + "Attention: This message was sent by Python!\n"
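The string-building steps above can be collected into one function. This is only a sketch: the name build_message is invented here, and the standard email parser is used afterward simply to confirm that the blank line is what separates the header from the body.

```python
from email import message_from_string

def build_message(sender, receiver, subject, body):
    """Assemble header lines, a blank separator line, and the body."""
    msgt = "From: " + sender + "\n"
    msgt = msgt + "To: " + receiver + "\n"
    msgt = msgt + "Subject: " + subject + "\n"
    msgt = msgt + "\n"              # Blank line: end of the header
    msgt = msgt + body + "\n"
    return msgt

msgt = build_message("me@gmail.com", "you@gmail.com", "Just a message",
                     "Attention: This message was sent by Python!")

# Parse the string back to check that each part landed where expected
parsed = message_from_string(msgt)
print(parsed["Subject"])
print(parsed.get_payload())
```

Without the blank line, a body that begins with a word and a colon, as this one does, would be read as one more header field.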

The string variable msgt now holds the whole message. This layout (header lines, a blank line, then the body) is the basic Internet message format, which the Multipurpose Internet Mail Extensions (MIME) standard extends. The next step for the program is to try to establish a connection with the sender's email server. For this, the smtplib module is needed, specifically its SMTP class. It is called, passing the name of the user's email server (and, optionally, a port number) as parameters, and it returns an object that represents the connection to that server. In this example, the variable that refers to it is named server:

server = smtplib.SMTP('smtp.gmail.com', 587)    # 587 is Gmail's port for STARTTLS

If it is not possible to connect to the server for some reason, then an error will occur. It is therefore a good idea to place this in a try-except block:

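try:
    server = smtplib.SMTP('smtp.gmail.com', 587)
except OSError:            # Covers network failures and smtplib's own errors
    print("Error occurred. Can't connect")
else:
    pass                   # The remaining steps, described next, go here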

Now comes the complexity that Gmail and some other servers introduce. What happened after the call to smtplib.SMTP() is that a communications session has been opened up. There is now an active connection between the client computer and the server at smtp.gmail.com. Some servers demand a level of security that ensures that other parties cannot modify or even read the message. This is accomplished using a protocol named Transport Layer Security (TLS), the details of which are not completely relevant because the modules take care of it. However, to send data to smtp.gmail.com, the server must be told to begin using TLS:

server.starttls()
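The starttls() method accepts an optional context argument that holds the TLS settings. The sketch below assumes that the defaults from ssl.create_default_context() are wanted (they verify the server's certificate and host name). The helper name start_secure_session is made up, and the explicit ehlo() calls, which introduce the client to the server before and after encryption begins, are optional because smtplib issues them itself when needed.

```python
import ssl

def start_secure_session(server):
    """Switch an open smtplib.SMTP connection over to TLS."""
    context = ssl.create_default_context()   # Verifies certificate and host name
    server.ehlo()                            # Identify the client to the server
    server.starttls(context=context)         # Encrypt the rest of the session
    server.ehlo()                            # Identify again over the secure link
    return context
```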

Now the user must be authenticated using their ID and password:

server.login(LOGIN,PASSWD)

Only now can a message be sent, and only if the login ID and password are correct. The sender is the string sndr, the recipient is rcvr, and the message is msgt:

server.sendmail(sndr, rcvr, msgt)

Now that the message has been sent, it is time to close the session. Logging off of the server is done as follows:

server.quit()
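The individual steps above can be gathered into one function. What follows is a sketch rather than a definitive implementation: the function name send_message and its smtp_class parameter are invented for illustration (the parameter only exists so that the function can be tried without a real server), and port 587 is assumed because it is the port Gmail documents for STARTTLS.

```python
import smtplib

def send_message(sndr, rcvr, msgt, login, passwd,
                 host="smtp.gmail.com", port=587, smtp_class=smtplib.SMTP):
    """Run the connect, TLS, login, send, quit sequence; True on success."""
    try:
        # The with statement sends QUIT and closes the connection at the end
        with smtp_class(host, port) as server:
            server.starttls()
            server.login(login, passwd)
            server.sendmail(sndr, rcvr, msgt)
    except OSError as err:        # smtplib's own exceptions are OSError subclasses
        print("Error occurred. Message not sent:", err)
        return False
    return True
```

A call such as send_message(sndr, rcvr, msgt, LOGIN, PASSWD) would then replace the separate statements, and the return value tells the caller whether the message was handed to the server.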

This program sends one email, but it can be easily modified to send many emails, one after the other. It can be modified to read the message from the keyboard, or perform any of the functions of a typical email-sending program (Exercise 1).

The module email can be used to format the message in MIME form. Calling MIMEText(s) converts the message string s into an internal form, a MIME message object. Fields like the subject and sender can be added to this object, and then it is sent as was done before. For example,

import smtplib

from email.mime.text import MIMEText

LOGIN = "yourloginID"
PASSWD = "yourpassword"

with open("message.txt", "r") as fp:   # Read the message from a file
    mtest = fp.read()
# Or: simply use a string
#mtest = "A message from Python: Merry Christmas."

msg = MIMEText(mtest)                  # Create a MIME message
sndr = "pythontextbook@gmail.com"      # Sender's email
rcvr = "pythontextbook@gmail.com"      # Recipient's email
msg['Subject'] = 'Mail from Python'    # Add Subject to the message
msg['From'] = sndr                     # Add sender to the message
msg['To'] = rcvr                       # Add recipient to the message

# Send the message using Google's SMTP server, as before
# (a mail server on localhost could work too)
s = smtplib.SMTP('smtp.gmail.com', 587)   # 587 is Gmail's port for STARTTLS
s.starttls()
s.login(LOGIN, PASSWD)
s.send_message(msg)
s.quit()

Using MIMEText() to create the message avoids having to format it correctly using basic string operations.
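To see what MIMEText() takes care of, the finished message can be printed instead of sent. This short sketch needs no server and reuses the sample strings from the example; as_string() returns the text that would be passed along.

```python
from email.mime.text import MIMEText

msg = MIMEText("A message from Python: Merry Christmas.")
msg['Subject'] = 'Mail from Python'
msg['From'] = 'pythontextbook@gmail.com'
msg['To'] = 'pythontextbook@gmail.com'

# as_string() produces the header lines, a blank line, and then the body
text = msg.as_string()
print(text)
```

The printed header includes fields such as Content-Type and MIME-Version that were never typed in by hand, followed by the blank line and the body.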

1. Reading email

Reading email is more complicated than writing it. The content of an email is often a surprise, and so a reader must be prepared to parse anything that might be sent. There can be multiple mailboxes: which mailbox will be looked at? There are usually many messages in a mailbox: how can they be distinguished? In addition, the protocol for retrieving mail from a server is different from that used to send it. There are two competing protocols: POP and IMAP.

The Post Office Protocol (POP) is the older of the two schemes, although it has been updated a few times. It certainly allows the basic requirements of a mail reader, which is to download and delete a message in a remote mailbox (i.e., on the server). The Internet Message Access Protocol (IMAP) is intended for use by many email clients, and so messages tend not to be deleted until that is requested.

When setting up an email client, one of these protocols usually has to be specified, and then it will be used from then on. The example here uses IMAP.

2. Example: Display the Subject Headers for Emails in the Inbox

An outline for the process of reading email is sketched on the right side of Figure 13.1. Reading email uses a different module from the one used to send email: imaplib, for reading from an IMAP server. The function names are different from those in smtplib, but the purpose of some of them is the same. The first three steps in reading email are as follows:

import imaplib

server = 'imap.gmail.com'            # Gmail's IMAP server
USER = "pythontextbook@gmail.com"    # User ID
PASSWORD = "password"                # Mask this password
EMAIL_FOLDER = "Inbox"
mbox = imaplib.IMAP4_SSL(server)     # Connect to the server
mbox.login(USER, PASSWORD)           # Authenticate (log in)

The next step is to select a mailbox to read. Each has a name, and is really just a directory someplace. The variable mbox is an instance of the class imaplib.IMAP4_SSL, the details of which can be found in many places, including the Internet. It has a method named select() that allows the examination of a mailbox, given its name (a string). The string is held in a variable named EMAIL_FOLDER, which contains "Inbox", and the call to select() that essentially opens the inbox is

z = mbox.select(EMAIL_FOLDER)

The return value is a tuple. The first element indicates success or failure: if z[0] contains the string "OK", then the mailbox is open. The usual alternative is "NO". The second element of the tuple indicates how many messages there are, but it is in an odd format: a list holding a bytes string. If there are 2 messages, as in the example, this string is b'2'; if there were 3 messages it would be b'3'; and so on. The messages in the mailbox are numbered from 1 up to this count, and these numbers are called message sequence numbers.
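Because the count arrives as bytes inside a list, a small helper can turn the reply into an ordinary integer. This is a sketch with an invented name (message_count); the tuples passed in below are hand-written stand-ins for what select() returns, and the text of the failure reply is made up.

```python
def message_count(reply):
    """Return the number of messages reported by select(), or None on failure."""
    status, data = reply
    if status != "OK":
        return None
    return int(data[0])      # data[0] is bytes such as b'2'; int() accepts bytes

print(message_count(("OK", [b"2"])))
print(message_count(("NO", [b"Unknown mailbox"])))
```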

Having opened the mailbox, the next step is to read it and extract the messages. The protocol requires that the mailbox be searched for the messages that are wanted. The imaplib.IMAP4_SSL class offers the search() method for this, the simplest form being

mbox.search(None, "ALL")

which returns the sequence numbers of all of the messages in the mailbox. IMAP provides search functionality, and all this method does is connect to it, which is why it seems awkward to use. The first parameter specifies a character set, and None allows it to default to a general value. The second parameter specifies a search criterion as a string. There are dozens of search keys that can be used here and the documentation for IMAP should be examined in detail for solutions to specific problems. However, some of the more useful keys include

ANSWERED: Messages that have been answered

BCC <string>: Messages with a specific string in the BCC field

BEFORE <date>: Messages whose date (not time) is earlier than the specified one

HEADER <field-name> <string>: A specified field in the header contains the string

SUBJECT <string>: Messages that contain the specified string in the SUBJECT field

TO <string>: Messages that contain the specified string in the TO field

UNSEEN: Messages that do not have the \Seen flag set

A call to search() that looks for the text “Python” in the subject line is

mbox.search(None, "SUBJECT Python")
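Search keys can be combined by writing them one after another, in which case a message must satisfy all of them, and a phrase containing spaces has to be enclosed in double quotes. The following sketch builds such a criterion string; the helper name subject_search and its unseen_only parameter are invented for this illustration.

```python
def subject_search(text, unseen_only=False):
    """Build an IMAP search string for messages whose subject contains text."""
    # Quoting keeps a multi-word phrase together as a single search string;
    # backslashes and double quotes inside the phrase are escaped first
    quoted = '"' + text.replace('\\', '\\\\').replace('"', '\\"') + '"'
    criteria = "SUBJECT " + quoted
    if unseen_only:
        criteria = "UNSEEN " + criteria   # Keys written side by side must all match
    return criteria

print(subject_search("Python"))
print(subject_search("Chapter 13", unseen_only=True))
```

The resulting string is passed as the second argument, as in mbox.search(None, subject_search("Chapter 13", unseen_only=True)).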

The search() function returns a tuple again, where the first component is a status string (i.e., "OK", "NO", or "BAD") and the second is a list of messages satisfying the search criteria in the same format as before. If the second message is the only match, this string will be b'2'. If the first three match, it will be b'1 2 3'.
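Splitting that bytes string on spaces gives one item per matching message, ready to be handed to the next step one at a time. A minimal sketch, with an invented helper name (message_numbers) and hand-written replies in place of a real server:

```python
def message_numbers(reply):
    """Return the matching message numbers from a search() reply as a list."""
    status, data = reply
    if status != "OK":
        return []
    return data[0].split()   # b'1 2 3' becomes [b'1', b'2', b'3']

print(message_numbers(("OK", [b"1 2 3"])))
print(message_numbers(("OK", [b""])))      # No matches: an empty list
```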

Finally, the messages are read, or fetched. The imaplib.IMAP4_SSL class has a fetch() method to do this, and it again takes some odd parameters. What a programmer thinks of the interface or the API or, in other words, the contract, is not important. What must be done is to satisfy the requirements and accept the data as it is offered. The fetch() method accepts two parameters. The first indicates which message is desired: the first message is b'1', the second is b'2', and so on. The second parameter indicates what should be returned. The header? If so, pass '(RFC822.HEADER)' as the parameter. Why? Because that is what the protocol asks for. RFC 822 is the name of the standard that defined the format of Internet text messages. If the email body is wanted, then pass '(RFC822.TEXT)'. A short list of possibilities is

RFC822: Everything

RFC822.HEADER: No body, header only

RFC822.TEXT: Body only

RFC822.SIZE: Message size

UID: Message identifier

Several of these specifiers can be passed at once. For example,

mbox.fetch(num, '(UID RFC822.TEXT RFC822.HEADER)')

returns data with three parts: the ID, the body, and the header. The header tends to be exceptionally long, 40 lines or so. For this example, the only part of the header that is interesting is the "Subject" part. Fields in the header are separated by the characters "\r\n", so they are easy to extract in a call to split(). Eliminating the header data for a moment, the call

(env, data) = mbox.fetch(num, '(UID RFC822.TEXT)')

results in a tuple that has an "envelope" that should indicate "OK" (the env variable). The data part is a list that contains the UID and the text body of the message. For example,

[(b'2 (UID 22 RFC822.TEXT {718}', b"Got a collection of old 45's for sale. Contact me.\r\n\r\n-- \r\n"), b')']

This says that this is message 2 and shows the text of that message.
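Pulling the useful pieces out of a reply like this takes a little unpacking. The sketch below works on a hand-typed copy of the sample reply rather than a live mailbox; the helper name uid_and_body is invented, and decoding the body as UTF-8 is an assumption that will not hold for every message.

```python
import re

# Sample reply in the form shown above
data = [(b'2 (UID 22 RFC822.TEXT {718}',
         b"Got a collection of old 45's for sale. Contact me.\r\n\r\n-- \r\n"),
        b')']

def uid_and_body(data):
    """Return (uid, body text) from the data part of a fetch() reply."""
    descriptor, body = data[0]        # First item pairs a description with the body
    match = re.search(rb"UID (\d+)", descriptor)
    uid = int(match.group(1)) if match else None
    # UTF-8 is assumed here; undecodable bytes are replaced rather than raising
    return uid, body.decode("utf-8", errors="replace")

uid, text = uid_and_body(data)
print(uid)
print(text.splitlines()[0])
```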

This example is supposed to print all of the subject headers in this mailbox. The call to fetch() should extract the header only:

(env, data) = mbox.fetch(num, '(RFC822.HEADER)')

The details of IMAP are complex enough that it is easy to forget what the original task was, which was to print the subject lines from the messages in the mailbox. All of the relevant methods have been described and completing the program is possible. The entire program is as follows:

import imaplib

server = 'imap.gmail.com'             # IMAP server
USER = "pythontextbook@gmail.com"     # User ID
PASSWORD = ""                         # Mask this password
EMAIL_FOLDER = "Inbox"                # Which mailbox?

mbox = imaplib.IMAP4_SSL(server)      # Connect
mbox.login(USER, PASSWORD)            # Authenticate
env, data = mbox.select(EMAIL_FOLDER) # Select the mailbox
if env == 'OK':                       # Did it work?
    print("Printing subject headers:", EMAIL_FOLDER)
    env, data = mbox.search(None, "ALL")   # Select the messages wanted
    if env != 'OK':
        print("No messages.", env)
        exit()
    for num in data[0].split():       # For each selected message b'1 2 3 ...'
        (env, data) = mbox.fetch(num, '(RFC822.HEADER)')   # Read it
        if env != 'OK':
            print("ERROR getting message", num, ", ", env)
            break
        # str() of a bytes object gives its printed form, in which a carriage
        # return shows up as the two characters backslash and r
        s = str(data[0][1])
        k = s.find("Subject")         # Look for the string "Subject" in the header
        if k >= 0:                    # Found it?
            s = s[k:]                 # Extract the string to the next '\r'
            k = s.find('\\r')
            s = s[:k]
            print(s)
    mbox.close()
else:
    print("No such mailbox as ", EMAIL_FOLDER)
mbox.logout()

The typical output would be as follows:

Printing subject headers: Inbox

Subject: Contents of Chapter 13

Subject: 45 RPM

Subject: another email
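The program above finds the subject by searching the printed form of the header bytes. An alternative, sketched here with the standard email package and a made-up sample header, is to parse the header and ask for the field by name. The helper name subject_of is invented, and the decode_header() and make_header() calls are there for subjects that arrive in encoded form, as those containing non-ASCII text do.

```python
from email import message_from_bytes
from email.header import decode_header, make_header

def subject_of(raw_header):
    """Return the Subject field of a raw header (bytes) as readable text."""
    msg = message_from_bytes(raw_header)
    raw_subject = msg.get("Subject", "")   # Field names are matched regardless of case
    # Turn encoded words such as =?utf-8?b?...?= into ordinary text
    return str(make_header(decode_header(raw_subject)))

# Hypothetical header bytes, shaped like data[0][1] after a fetch of '(RFC822.HEADER)'
sample = (b"From: pythontextbook@gmail.com\r\n"
          b"Subject: 45 RPM\r\n"
          b"To: pythontextbook@gmail.com\r\n\r\n")
print("Subject:", subject_of(sample))
```

Inside the loop of the program, the equivalent call would be subject_of(data[0][1]).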

The point of this section was to demonstrate how a Python program, or any program for that matter, must comply with external specifications when interfacing with sophisticated software systems, and to introduce the concept of a protocol, a contract between developers. A program that can send email is useful by itself.

Source: Parker James R. (2021), Python: An Introduction to Programming, Mercury Learning and Information; Second edition.
