Reading and Writing Text Files in Python

Reading and writing text files in Python is a fundamental skill for developers, engineers and data scientists.
Andrew Wood  •   27 May 2022
Andrew Wood  •   Last Updated: 27 May 2022

Working with text files in Python is a common task in engineering, data analysis and scientific applications. Python makes this task a breeze once you know how. This tutorial goes through some of the basic file handling capabilities that come standard with a clean Python install.

Text and Binary Files

Python can read and write both ordinary text files and binary files. This article only considers text files, as they are the more common format to work with.

  • A text file is a human-readable file in which each line ends with an End of Line (EOL) character. This character tells the computer what to do when it reaches the end of a line in the file. The most commonly used EOL character is the newline, which is written \n. All text following a newline will render on a new line.

Input:

astring = "hello my name is George. I am a single line kind of guy!"
print(astring)
print("\n")
# Everything after the newline character \n will be written on a new line.
bstring = "hello my name is George.\nI am now on a new line."
print(bstring)

Output:

"hello my name is George. I am a single line kind of guy!"

"hello my name is George.
I am now on a new line."

Opening and Closing a File

All files must be opened before use and closed afterwards when working with them in Python. This can be done explicitly using the open() function and the file object's close() method.

f = open(filepath,'r')
# perform file handling operations
f.close()

The most common way to do this is to use the with open statement, which automatically closes the file once the indented block finishes, even if an error occurs inside it. You do not need to call close() when using with open.

In practice you will (almost) always use the with open statement when working with a single text file.

with open(filepath,'r') as infile:
    # perform your file handling operations
    pass  # placeholder so that the block is valid Python
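
One way to confirm that the with open statement really does close the file is to inspect the closed attribute of the file object. The sketch below is illustrative only: the temporary folder and the file name demo.txt are assumptions made so that the example runs anywhere.

```python
import os
import tempfile

# Create a throwaway file in a temporary folder (hypothetical file name).
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(demo_path, "w") as f:
    f.write("some text\n")
    closed_inside = f.closed   # False: the file is still open inside the block

closed_after = f.closed        # True: leaving the block closed the file

print(closed_inside, closed_after)
```

The variable f still exists after the block, but any further read or write on it would raise a ValueError because the file is closed.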

Let’s take a closer look at the open function.

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

The two most important parameters in the open function are file and mode, and these are the parameters we'll focus on.

file is a string that gives the path to the text file you wish to work with. If the text file is in the same folder that you run your Python script from, then you only need to give the name of the file. However, as soon as the text file is in a different location to the Python script you need to explicitly specify its location. This is often done using the os module, which is part of the Python standard library.
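
As a rough sketch of how the os module can be used to build such a path, consider the snippet below. The folder name inputfiles and the file name sample-file.txt are placeholders chosen for illustration, and the file does not need to exist for the path to be built.

```python
import os

# Hypothetical folder and file names, used only for illustration.
data_folder = "inputfiles"
file_name = "sample-file.txt"

# Join the parts using the correct separator for the operating system.
relative_path = os.path.join(data_folder, file_name)

# A relative path is resolved against the current working directory.
absolute_path = os.path.abspath(relative_path)

print(relative_path)
print(absolute_path)
print(os.path.exists(absolute_path))  # only True if the file really exists
```

Checking os.path.exists() before opening a file for reading is one simple way to give a clearer error message when a path is wrong.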

mode is an optional parameter which tells Python how the file will be used, for example for reading or for writing. It defaults to read-only text mode, 'r'.

Mode   Description
'r'    Open the file for reading only (default).
'w'    Open the file for writing (will overwrite existing content).
'a'    Open the file and append text to the end of it.
'b'    Binary mode.
't'    Text mode (default).
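
The difference between 'w' and 'a' is easiest to see by writing to the same file several times and reading the result back. This is a minimal sketch that uses a temporary file with a made-up name (modes-demo.txt) so that nothing important is overwritten.

```python
import os
import tempfile

# Throwaway file in a temporary folder (hypothetical file name).
demo_path = os.path.join(tempfile.mkdtemp(), "modes-demo.txt")

with open(demo_path, "w") as f:   # 'w' creates the file
    f.write("first\n")

with open(demo_path, "w") as f:   # 'w' again discards the existing content
    f.write("second\n")

with open(demo_path, "a") as f:   # 'a' keeps the content and adds to the end
    f.write("third\n")

with open(demo_path, "r") as f:   # 'r' only allows reading
    contents = f.read()

print(repr(contents))  # 'second\nthird\n'
```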

To open a file named “sample-file.txt” that we do not wish to write to (read only), it is best to use the with open syntax:

with open("sample-file.txt",'r') as f:
    # Do something
    pass  # placeholder so that the block is valid Python

The mode here is specified as just 'r' and not 'rt' because text mode is the default. If you wanted to open a binary file for reading only, the mode would be written 'rb'.
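
To illustrate what the 'b' flag changes, the sketch below reads the same file once in text mode and once in binary mode. The temporary file and its name are assumptions made for the example.

```python
import os
import tempfile

# Throwaway file in a temporary folder (hypothetical file name).
demo_path = os.path.join(tempfile.mkdtemp(), "text-vs-binary.txt")

with open(demo_path, "w") as f:
    f.write("Simple is better than complex.\n")

with open(demo_path, "r") as f:    # text mode: read() returns a str
    as_text = f.read()

with open(demo_path, "rb") as f:   # binary mode: read() returns bytes
    as_bytes = f.read()

print(type(as_text))
print(type(as_bytes))
```

In text mode Python decodes the bytes on disk into a string for you, while in binary mode you receive the raw bytes unchanged.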

Extracting (Reading) Data from the File

There are three methods used to read the contents of a text file.

  • read() returns the data in the text file as a single string of characters.
  • readline() reads one line from the file and returns it as a string. Each call reads only a single line, starting where the previous read finished.
  • readlines() reads all the lines in the text file and returns them as a list of strings, one per line. This method uses the EOL characters to separate the items in the list. To extract all the data from a text file you will more often loop over the file object directly with a for loop, which yields the same lines one at a time.
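
The read methods all continue from the current position in the file, which the following sketch tries to make visible. It writes its own three-line temporary file (the file name is an assumption) and then mixes readline() and readlines() on the same open file.

```python
import os
import tempfile

# Throwaway three-line file in a temporary folder (hypothetical file name).
sample_path = os.path.join(tempfile.mkdtemp(), "three-lines.txt")
with open(sample_path, "w") as f:
    f.write("Beautiful is better than ugly.\n"
            "Explicit is better than implicit.\n"
            "Simple is better than complex.\n")

with open(sample_path, "r") as f:
    first = f.readline()       # reads the first line only
    remaining = f.readlines()  # reads whatever is left, as a list
    at_end = f.readline()      # '' (empty string) signals the end of the file

print(repr(first))
print(remaining)
print(repr(at_end))
```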

We’ll demonstrate the three different methods used with an example. Our example text file is the first few lines from the Zen of Python.

[Image: sample text file of four sentences.]

If you open this file in an editor such as Notepad++ you can display the EOL characters. Python will read and store these characters as a part of the string unless you specify otherwise.
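
A quick way to see the stored EOL characters from within Python is to print the repr() of a line, which shows escape sequences such as \n rather than acting on them. The following is a minimal sketch; the temporary file and its contents are assumptions made for illustration.

```python
import os
import tempfile

# Throwaway file in a temporary folder (hypothetical location).
sample_path = os.path.join(tempfile.mkdtemp(), "sample-textfile.txt")
with open(sample_path, "w") as f:
    f.write("Beautiful is better than ugly.\nExplicit is better than implicit.\n")

with open(sample_path, "r") as f:
    first_line = f.readline()

print(first_line)                 # the newline is acted on, so a blank line follows
print(repr(first_line))           # 'Beautiful is better than ugly.\n'
print(first_line.endswith("\n"))  # True
```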

Example using read() 

The read() method returns a single string of characters, which is mostly useful if you have a small file that you wish to read. As soon as your data spans multiple rows it is probably better to use readlines() rather than read().

with open("sample-textfile.txt",'r') as f:
    string = f.read()

Output (value of string):

"Beautiful is better than ugly.\nExplicit is better than implicit.\nSimple is better than complex.\nComplex is better than complicated."

Example using readline()

readline() returns the next line in the text file, which on the first call is the first line. The first line is defined as everything from the first character in the file up to and including the first EOL character.

with open("sample-textfile.txt",'r') as f:
    string = f.readline()

Output (value of string):

"Beautiful is better than ugly.\n"

Example using readlines() 

readlines() runs through each line in the file and stores it, along with its EOL character, as a string in a list. Each line may then be accessed by referring to the applicable index in the list.

with open("sample-textfile.txt",'r') as infile:
    string = infile.readlines()

print(string)

Output:

['Beautiful is better than ugly.\n', 'Explicit is better than implicit.\n', 'Simple is better than complex.\n', 'Complex is better than complicated.']

The newline characters are preserved in the resulting strings housed in the list. Most of the time it is preferable to remove them. This can be accomplished using the strip() string method, which removes leading and trailing whitespace (including newlines), as we'll demonstrate below.

For loop and strip()

The open() function returns an object which is iterable, meaning that you can access the lines in the file one by one using a for loop. Calling the strip() method on each line inside the loop removes the EOL characters, leaving a nicely formatted list that you can then access in your Python code.

This is generally the preferred method to import a text file into your Python code.

You can replace strip() with lstrip() or rstrip() if you only wish to remove whitespace from the beginning or end of the string respectively.

outlist = []
with open("sample-textfile.txt",'r') as infile:
    for line in infile:
        line = line.strip()
        outlist.append(line)
    
print(outlist)

print("This output is a: "+ str(type(outlist)))

Output:

['Beautiful is better than ugly.', 'Explicit is better than implicit.', 'Simple is better than complex.', 'Complex is better than complicated.']

This output is a: <class 'list'>
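
The same loop can be written more compactly as a list comprehension, and the choice between strip(), rstrip() and rstrip("\n") matters when a line has meaningful leading or trailing spaces. The sketch below compares the three on a small temporary file; the file name and its contents are made up for the example.

```python
import os
import tempfile

# Throwaway file whose second line has leading and trailing spaces.
sample_path = os.path.join(tempfile.mkdtemp(), "indented-sample.txt")
with open(sample_path, "w") as f:
    f.write("Flat is better than nested.\n    Sparse is better than dense.  \n")

with open(sample_path, "r") as infile:
    stripped = [line.strip() for line in infile]         # both ends cleaned

with open(sample_path, "r") as infile:
    right_only = [line.rstrip() for line in infile]      # indentation kept

with open(sample_path, "r") as infile:
    newline_only = [line.rstrip("\n") for line in infile]  # only the EOL removed

print(stripped)
print(right_only)
print(newline_only)
```

Passing "\n" to rstrip() is a reasonable choice when the spaces in a line carry meaning, such as in indented or fixed-width data.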

Working with Text Files Containing Numbers

It is worth remembering that the methods shown above all result in a string output (either a single string or a list of strings depending on which read method is used).

In many engineering or data science applications, you may receive an input from a text file and then need to perform some calculation or analysis on that input. If this is the case, then you must remember to change the output datatype into the applicable numerical format or else you won't be able to perform any mathematical operations on the text input data.
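
To see why the conversion matters, the short sketch below applies arithmetic operators to number strings before and after converting them. The values "23" and "12" are chosen purely for illustration.

```python
a_text = "23"
b_text = "12"

joined = a_text + b_text     # '2312': + joins the two strings together
repeated = a_text * 2        # '2323': * repeats the string

try:
    a_text * b_text          # multiplying two strings is not defined
    multiply_failed = False
except TypeError:
    multiply_failed = True

product = int(a_text) * int(b_text)   # 276 once both are integers

print(joined, repeated, multiply_failed, product)
```

Use float() instead of int() when the text may contain decimal values.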

Suppose we have a text file, calculation-input.txt, that contains two numbers that we wish to multiply:

a:23
b:12

Before we can work with the two numbers we first have to (i) read this into our Python script, (ii) extract the numbers from the resulting strings, (iii) convert the string number into a numerical datatype, and then (iv) perform the multiplication.

In the example below we have created a simple function readinputfile() which takes a filepath string and the mode as inputs and returns a list of strings with the end of line characters removed.

The numbers are extracted from the list, and then converted into integers before being multiplied.

import os

calcinputfile = "calculation-input.txt"
inputfolder = "inputfiles" 
# input file is stored in a folder which is at the same level as the script
calcfilepath = os.path.join(inputfolder,calcinputfile)

def readinputfile(filepath,mode='r'):
    filelist = []
    with open(filepath,mode) as infile: # use the mode passed to the function
        for line in infile:
            line = line.strip()
            filelist.append(line)
    return filelist
        
datalist = readinputfile(calcfilepath)

print(datalist)
print("\n")
print("This data stored in the list is a: "+ str(type(datalist[0])))
print("\n")

a = datalist[0][2:] # extract the number from the string
print("a = "+a + " (data type:"+ str(type(a))+ ")")
b = datalist[1][2:]
print("b = "+b + " (data type:"+ str(type(b))+ ")")

# convert the strings to integers
a = int(a)
b = int(b)
print("\n")
print("a now has data type: " + str(type(a)))
print("b now has data type: " + str(type(b)))

c = a*b
print("\n")
print("c = "+ str(c))

Output:

['a:23', 'b:12'] # datalist is a list of strings

This data stored in the list is a: <class 'str'>

# the numbers are still of the type string
"a = 23 (data type:<class 'str'>)"
"b = 12 (data type:<class 'str'>)"

# convert the strings to integers using int()
"a now has data type: <class 'int'>"
"b now has data type: <class 'int'>"

# Final result
"c = 276"

Writing to a Text File

Writing to a text file follows a very similar methodology to that of reading files. Use a with open statement but now change the mode to 'w' to write a new file (or overwrite an existing one) or 'a' to append to an existing file.

There are two write functions that you can use:

  • write() takes a string as an input and writes it to the file exactly as given. It does not add an EOL character for you.
  • writelines() takes a list of strings as an input and writes out each string in the list, one after the other.

If you do not specify end of line characters then each string will be added directly after the last.
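
The effect of leaving out the end of line characters can be sketched as follows. The temporary file and its name are assumptions made so that the example is self-contained.

```python
import os
import tempfile

# Throwaway file in a temporary folder (hypothetical file name).
demo_path = os.path.join(tempfile.mkdtemp(), "no-newlines.txt")

with open(demo_path, "w") as f:
    f.write("Hello World")             # no '\n' at the end
    f.write("How are you doing?\n")    # continues on the same line
    f.writelines(["one", "two\n"])     # writelines() adds no separators either

with open(demo_path, "r") as f:
    contents = f.read()

print(repr(contents))  # 'Hello WorldHow are you doing?\nonetwo\n'
```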

Example using write()

import os

outfilename = "sample-output.txt"
outputfolder = "output"
os.makedirs(outputfolder, exist_ok=True) # create the output folder if it does not already exist
filepath = os.path.join(outputfolder,outfilename)

with open(filepath,'w') as f:
    f.write("Hello World\n")
    f.write("How are you doing?\n")

The resulting output file will consist of two lines: "Hello World" and "How are you doing?".

Example using writelines() and Append

We can append some additional text to the sample-output.txt file by setting mode='a'. We have created a list of strings (without forgetting to add newlines '\n' where necessary) which will be added using the writelines() method.

import os

outputlist = ['Hello World\n',"How are you?\n","My name is Sue\n"]

outfilename = "sample-output.txt"
outputfolder = "output"
filepath = os.path.join(outputfolder,outfilename)

with open(filepath,'a') as f: # note we are appending to existing file.
    f.writelines(outputlist)

The final output includes the original lines plus those appended from outputlist.
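
One way to check this is to repeat both steps and read the file back. The sketch below reproduces the two examples in a temporary folder (an assumption, so that your real output folder is left untouched) and prints the final contents.

```python
import os
import tempfile

# Temporary folder standing in for the "output" folder used above.
outputfolder = tempfile.mkdtemp()
filepath = os.path.join(outputfolder, "sample-output.txt")

# Step 1: write the original two lines.
with open(filepath, "w") as f:
    f.write("Hello World\n")
    f.write("How are you doing?\n")

# Step 2: append the list of strings.
outputlist = ["Hello World\n", "How are you?\n", "My name is Sue\n"]
with open(filepath, "a") as f:
    f.writelines(outputlist)

# Read the file back to inspect the result.
with open(filepath, "r") as f:
    final_lines = [line.rstrip("\n") for line in f]

for line in final_lines:
    print(line)
```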

Summary

Reading and writing to text files in Python is fairly straightforward when you keep the following in mind:

  • Text files must be opened and closed when working with them. The easiest way is to use a with open statement, which ensures the file is closed when the with block exits.
  • The open() function returns an iterable file object, so the quickest way to extract every line of the file is to iterate through it using a for loop. If you need to remove the end of line characters or any extra whitespace then use strip(), lstrip() or rstrip() as necessary.
  • The open() function takes a mode that tells it whether the text file is read-only ('r', the default), writable ('w') or writable through appending to the end of the file ('a').
  • If you are writing to a text file don't forget to add the necessary end of line characters (newline:'\n') to format the file as you require.
  • When writing to a file you will either use write() for a single string or writelines() to write out every string in a list.

# Read text file contents into a list
outlist = []
with open("textfile_path",'r') as f:
    for line in f:
        line = line.strip()
        outlist.append(line)
    
print(outlist)

# write out a textfile with the contents of a list.
list_to_write = ['Line one','Line two','Line three']
list_to_write = [i+'\n' for i in list_to_write]

with open("textfile_path",'w') as f:
    f.writelines(list_to_write)

Canard Analytics Founder. Python development, data nerd, aerospace engineering and general aviation.