
In this project, we will show you how to use Python to monitor a webpage on the Raspberry Pi.
The Python script will periodically check a website from your Raspberry Pi and tell you when the site goes down or its content changes. It accomplishes this by saving a copy of the webpage locally and comparing it against the freshly fetched version.
Because this website monitor is so lightweight, it will run without issue even on a Raspberry Pi Zero.
Throughout this tutorial, we will walk through writing the script step by step, so you should be able to adapt it to your individual requirements.
You can skip to the section headed “Running your Raspberry Pi Website Monitor Periodically” if you don’t want to learn how the code works. However, for email notifications to operate, you’ll still need to make some changes to the code.
Because it doesn’t require an interface, this project is ideal for a headless Raspberry Pi.
Although this article concentrates on the Raspberry Pi, the code will run on any device that supports Python 3, so you can also run this script on a Windows machine if you like.
Equipment
The equipment you need to set up a script to monitor websites on your Raspberry Pi is listed below.
- Raspberry Pi
- Micro SD Card
- Power Supply
- Ethernet Cable or Wi-Fi
- Optional
- Raspberry Pi Case
- HDMI Cable
- USB Keyboard
- USB Mouse
Installing the Website Monitor on your Raspberry Pi
Before writing the script, update the package list, upgrade any existing packages, and install Python 3, pip, and the Python libraries the monitor relies on:
sudo apt update
sudo apt upgrade -y
sudo apt install python3 python3-pip
pip3 install requests beautifulsoup4 lxml
Using the Website Monitor on your Raspberry Pi
With the packages installed, you can begin writing the website monitor script. Start by creating a file called websitemonitor.py:
nano websitemonitor.py
Python code for a simple website monitor
Start by importing the modules the script relies on: os for checking whether the cache file exists, sys for reading command-line arguments, and requests for fetching the webpage.

import os
import sys
import requests

Create the has_website_changed() function. It takes the URL you want to monitor and a name that is used for the cache file.

def has_website_changed(website_url, website_name):

Define the headers for the request. The custom user agent identifies the monitor, and the Cache-Control header asks the server not to serve us a cached copy of the page.

    headers = {
        'User-Agent': 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; PIWEBMON)',
        'Cache-Control': 'no-cache'
    }

Make the request to the website. If the response status code falls outside the 2XX range, return -1 to indicate an error.

    response = requests.get(website_url, headers=headers)

    if (response.status_code < 200 or response.status_code > 299):
        return -1

    response_text = response.text
    cache_filename = website_name + "_cache.txt"

If no cache file exists yet, create one from the current response and return 0, since there is nothing to compare against on the first run.

    if not os.path.exists(cache_filename):
        file_handle = open(cache_filename, "w")
        file_handle.write(response_text)
        file_handle.close()
        return 0

Otherwise, read the previous response from the cache and compare it with the new one. If they match, return 0. If they differ, overwrite the cache with the new response and return 1 to indicate a change.

    file_handle = open(cache_filename, "r+")
    previous_response_text = file_handle.read()
    file_handle.seek(0)

    if response_text == previous_response_text:
        file_handle.close()
        return 0
    else:
        file_handle.truncate()
        file_handle.write(response_text)
        file_handle.close()
        return 1
Creating the main() function

Create the main() function. It calls has_website_changed() with the URL and cache name passed on the command line, then prints a message based on the result.

def main():
    website_status = has_website_changed(sys.argv[1], sys.argv[2])

    if website_status == -1:
        print("Non 2XX response while fetching")
    elif website_status == 0:
        print("Website is the same")
    elif website_status == 1:
        print("Website has changed")
The Basic Code in its Final Form
import os
import sys
import requests


def has_website_changed(website_url, website_name):
    headers = {
        'User-Agent': 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; PIWEBMON)',
        'Cache-Control': 'no-cache'
    }
    response = requests.get(website_url, headers=headers)

    if (response.status_code < 200 or response.status_code > 299):
        return -1

    response_text = response.text
    cache_filename = website_name + "_cache.txt"

    if not os.path.exists(cache_filename):
        file_handle = open(cache_filename, "w")
        file_handle.write(response_text)
        file_handle.close()
        return 0

    file_handle = open(cache_filename, "r+")
    previous_response_text = file_handle.read()
    file_handle.seek(0)

    if response_text == previous_response_text:
        file_handle.close()
        return 0
    else:
        file_handle.truncate()
        file_handle.write(response_text)
        file_handle.close()
        return 1


def main():
    website_status = has_website_changed(sys.argv[1], sys.argv[2])

    if website_status == -1:
        print("Non 2XX response while fetching")
    elif website_status == 0:
        print("Website is the same")
    elif website_status == 1:
        print("Website has changed")


if __name__ == "__main__":
    main()
Testing the Basic Website Monitor on your Raspberry Pi
chmod +x websitemonitor.py
python3 websitemonitor.py https://pimylifeup.com/ pimylifeup
Using BeautifulSoup to improve the Raspberry Pi Website Monitor
Start by adding an import for BeautifulSoup at the top of the script:

from bs4 import BeautifulSoup

Creating a new cleanup_html() function

Create a new cleanup_html() function. It uses BeautifulSoup to strip out the script, style, and meta tags, whose contents frequently change between page loads even when the visible content has not. First, parse the HTML into a new BeautifulSoup object, then extract each unwanted tag and return the cleaned HTML as a string.

def cleanup_html(html):
    soup = BeautifulSoup(html, features="lxml")

    for s in soup.select('script'):
        s.extract()

    for s in soup.select('style'):
        s.extract()

    for s in soup.select('meta'):
        s.extract()

    return str(soup)

Finally, within has_website_changed(), run the retrieved HTML through cleanup_html() before it is stored. Find the following line:

response_text = response.text

Replace it with this one:

response_text = cleanup_html(response.text)
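As a quick illustration of what cleanup_html() removes, the standalone snippet below runs the function over a small sample page. It uses Python's built-in "html.parser" rather than lxml so the example works even without lxml installed; the behaviour of the tag removal is the same.

```python
from bs4 import BeautifulSoup

def cleanup_html(html):
    # Same logic as the tutorial's function, with the built-in parser.
    soup = BeautifulSoup(html, features="html.parser")
    for s in soup.select('script'):
        s.extract()
    for s in soup.select('style'):
        s.extract()
    for s in soup.select('meta'):
        s.extract()
    return str(soup)

# A sample page containing meta, style, and script tags.
html = ('<html><head><meta charset="utf-8"><style>p{color:red}</style></head>'
        '<body><p>Hello</p><script>var t = Date.now();</script></body></html>')

cleaned = cleanup_html(html)
print(cleaned)
```

Only the visible content survives, so a rotating script value no longer triggers a false "website has changed" result.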
With those changes in place, your code should look like the following.

import os
import sys
import requests
from bs4 import BeautifulSoup


def cleanup_html(html):
    soup = BeautifulSoup(html, features="lxml")

    for s in soup.select('script'):
        s.extract()

    for s in soup.select('style'):
        s.extract()

    for s in soup.select('meta'):
        s.extract()

    return str(soup)


def has_website_changed(website_url, website_name):
    headers = {
        'User-Agent': 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; PIWEBMON)',
        'Cache-Control': 'no-cache'
    }
    response = requests.get(website_url, headers=headers)

    if (response.status_code < 200 or response.status_code > 299):
        return -1

    response_text = cleanup_html(response.text)
    cache_filename = website_name + "_cache.txt"

    if not os.path.exists(cache_filename):
        file_handle = open(cache_filename, "w")
        file_handle.write(response_text)
        file_handle.close()
        return 0

    file_handle = open(cache_filename, "r+")
    previous_response_text = file_handle.read()
    file_handle.seek(0)

    if response_text == previous_response_text:
        file_handle.close()
        return 0
    else:
        file_handle.truncate()
        file_handle.write(response_text)
        file_handle.close()
        return 1


def main():
    website_status = has_website_changed(sys.argv[1], sys.argv[2])

    if website_status == -1:
        print("Non 2XX response while fetching")
    elif website_status == 0:
        print("Website is the same")
    elif website_status == 1:
        print("Website has changed")


if __name__ == "__main__":
    main()
Adding Email Support to the Raspberry Pi Website Monitor
Importing a new module

Start by importing smtplib, the module Python uses to talk to an SMTP server:

import smtplib

Defining constants to store the email settings

Next, define the constants that store the details of your SMTP server and email addresses. Replace the example values with your own.

SMTP_USER is the username used to log in to the SMTP server:

SMTP_USER='example@gmail.com'

SMTP_PASSWORD is the password for that account:

SMTP_PASSWORD='PASSWORD'

SMTP_HOST is the address of the SMTP server:

SMTP_HOST='smtp.gmail.com'

SMTP_PORT is the port to connect on. 465 is the standard port for SMTP over SSL:

SMTP_PORT=465

SMTP_SSL controls whether the connection uses SSL:

SMTP_SSL=True

SMTP_FROM_EMAIL is the address the notification is sent from:

SMTP_FROM_EMAIL='example@gmail.com'

SMTP_TO_EMAIL is the address the notification is sent to:

SMTP_TO_EMAIL='sendto@gmail.com'
Writing the email_notification() function

Next, write the email_notification() function, which takes a subject and a message and sends them as an email.

def email_notification(subject, message):

Start the function by establishing a connection to the SMTP server, using SSL if the SMTP_SSL constant is set to True:

    if (SMTP_SSL):
        smtp_server = smtplib.SMTP_SSL(SMTP_HOST, SMTP_PORT)
    else:
        smtp_server = smtplib.SMTP(SMTP_HOST, SMTP_PORT)

Identify the client to the SMTP server and log in with the configured credentials:

    smtp_server.ehlo()
    smtp_server.login(SMTP_USER, SMTP_PASSWORD)

Build the email text, inserting the from address, to address, subject, and message into a simple template. The blank line after the subject separates the email headers from the body:

    email_text = """From: %s
To: %s
Subject: %s

%s
""" % (SMTP_FROM_EMAIL, SMTP_TO_EMAIL, subject, message)

Finally, send the email and close the connection to the SMTP server:

    smtp_server.sendmail(SMTP_FROM_EMAIL, SMTP_TO_EMAIL, email_text)
    smtp_server.close()

To make use of the new function, update main() so that an email is sent when an error occurs or the website changes. In the branch containing the following line:

        print("Non 2XX response while fetching")

Add this call to email_notification():

        email_notification("An Error has Occurred", "Error While Fetching " + sys.argv[1])

In the branch containing the following line:

        print("Website has changed")

Add this call:

        email_notification("A Change has Occurred", sys.argv[1] + " has changed.")
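To see what the email template actually produces, here is a short standalone sketch. The subject, message, and addresses below are hypothetical example values, not values taken from the script:

```python
# Hypothetical example values, only to illustrate the message format.
SMTP_FROM_EMAIL = 'example@gmail.com'
SMTP_TO_EMAIL = 'sendto@gmail.com'
subject = "A Change has Occurred"
message = "https://example.com/ has changed."

# Same template as the email_notification() function: headers first,
# then a blank line, then the body.
email_text = """From: %s
To: %s
Subject: %s

%s
""" % (SMTP_FROM_EMAIL, SMTP_TO_EMAIL, subject, message)

print(email_text)
```

Keeping the template flush against the left margin matters: indenting the continuation lines would push spaces into the From, To, and Subject headers and produce a malformed email.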
The Code in Its Final Form
import os
import sys
import requests
from bs4 import BeautifulSoup
import smtplib

SMTP_USER='example@gmail.com'
SMTP_PASSWORD='password'
SMTP_HOST='smtp.gmail.com'
SMTP_PORT=465
SMTP_SSL=True
SMTP_FROM_EMAIL='example@gmail.com'
SMTP_TO_EMAIL='sendto@gmail.com'


def email_notification(subject, message):
    """Send an email notification.

    subject - The subject line of the email.
    message - The message to send as the body of the email.
    """
    if (SMTP_SSL):
        smtp_server = smtplib.SMTP_SSL(SMTP_HOST, SMTP_PORT)
    else:
        smtp_server = smtplib.SMTP(SMTP_HOST, SMTP_PORT)

    smtp_server.ehlo()
    smtp_server.login(SMTP_USER, SMTP_PASSWORD)

    email_text = """From: %s
To: %s
Subject: %s

%s
""" % (SMTP_FROM_EMAIL, SMTP_TO_EMAIL, subject, message)

    smtp_server.sendmail(SMTP_FROM_EMAIL, SMTP_TO_EMAIL, email_text)
    smtp_server.close()


def cleanup_html(html):
    """Cleanup the HTML content.

    html - A string containing HTML.
    """
    soup = BeautifulSoup(html, features="lxml")

    for s in soup.select('script'):
        s.extract()

    for s in soup.select('style'):
        s.extract()

    for s in soup.select('meta'):
        s.extract()

    return str(soup)


def has_website_changed(website_url, website_name):
    """Check if a website has changed since the last request.

    website_url - URL that you want to monitor for changes.
    website_name - Name used for the cache file.
    """
    headers = {
        'User-Agent': 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; PIWEBMON)',
        'Cache-Control': 'no-cache'
    }
    response = requests.get(website_url, headers=headers)

    if (response.status_code < 200 or response.status_code > 299):
        return -1

    response_text = cleanup_html(response.text)
    cache_filename = website_name + "_cache.txt"

    if not os.path.exists(cache_filename):
        file_handle = open(cache_filename, "w")
        file_handle.write(response_text)
        file_handle.close()
        return 0

    file_handle = open(cache_filename, "r+")
    previous_response_text = file_handle.read()
    file_handle.seek(0)

    if response_text == previous_response_text:
        file_handle.close()
        return 0
    else:
        file_handle.truncate()
        file_handle.write(response_text)
        file_handle.close()
        return 1


def main():
    """Check if the passed in website has changed."""
    website_status = has_website_changed(sys.argv[1], sys.argv[2])

    if website_status == -1:
        email_notification("An Error has Occurred", "Error While Fetching " + sys.argv[1])
        print("Non 2XX response while fetching")
    elif website_status == 0:
        print("Website is the same")
    elif website_status == 1:
        email_notification("A Change has Occurred", sys.argv[1] + " has changed.")
        print("Website has changed")


if __name__ == "__main__":
    main()
Running your Raspberry Pi Website Monitor Periodically
To run the website monitor on a schedule, you can use cron. Begin editing the current user's crontab with the following command:
crontab -e
At the bottom of the file, add the following line, replacing WEBSITEURL with the website you want to monitor and CACHENAME with the name to use for its cache file. This entry runs the script every minute.
* * * * * /usr/bin/python3 /home/pi/websitemonitor.py WEBSITEURL CACHENAME
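If checking every minute is more often than you need, adjust the first five fields of the entry. The variants below are illustrative; WEBSITEURL and CACHENAME are the same placeholders as above.

```shell
# Check every 5 minutes instead of every minute.
*/5 * * * * /usr/bin/python3 /home/pi/websitemonitor.py WEBSITEURL CACHENAME

# Check once an hour, on the hour, and append the script's output to a log file.
0 * * * * /usr/bin/python3 /home/pi/websitemonitor.py WEBSITEURL CACHENAME >> /home/pi/websitemonitor.log 2>&1
```

Because the cache file is created relative to the working directory, cron jobs for different sites should use distinct CACHENAME values so their caches do not collide.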