I Wrote a Python Web Scraper That Sends Me Text Messages About Job Postings

We will start with the Python file, which first needs a few imports.

# scrape.py
import os
import requests

The first module we import is os.

This gives our Python script the ability to interact with the operating system it is being run on. Because the script will later run an AppleScript file, it has to interact with OS X.

The second module import, requests, is an Apache2 Licensed HTTP library that lets us access all the HTTP verbs.

This is useful because it handles details for us, such as encoding parameters and appending them to the URL as a query string.
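To make the query-string point concrete, here is a minimal sketch that uses only the standard library's urllib.parse rather than requests, so it runs without a network connection. The helper name build_search_url and the example values are illustrative assumptions. requests performs a similar encoding step itself when you hand it a params dictionary, though the exact escaping it chooses may differ from what is shown here.

```python
from urllib.parse import quote, urlencode


def build_search_url(base_url, params):
    """Return base_url with params percent-encoded as a query string."""
    # quote_via=quote encodes spaces as %20 (the default would use '+').
    query = urlencode(params, quote_via=quote)
    return base_url + "?" + query


# Hypothetical search values, chosen to resemble a job search.
search_url = build_search_url(
    "https://www.indeed.com/jobs",
    {"q": "web developer", "l": "Denver, CO"},
)
print(search_url)
```

The printed URL has the same shape as the one used for scraping later in this post.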

Next we are going to import the pip packages we installed as well as one called datetime that comes with Python.

# scrape.py
from BeautifulSoup import BeautifulSoup
from apscheduler.schedulers.blocking import BlockingScheduler
from datetime import datetime

sch = BlockingScheduler()

We use BeautifulSoup to pull information from the HTML page we want to scrape.

apscheduler is what allows us to run the script at an interval of, say, every 24 hours for daily updates.

datetime lets us access the current time and date during script execution.

The last line creates a BlockingScheduler instance, which we access through the sch variable.
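This post does not show where datetime ends up being used, so the following is only an assumed illustration: a small helper (the name stamp is hypothetical) that prefixes a message with the current date and time, which can be handy for seeing when each scheduled run happened.

```python
from datetime import datetime


def stamp(message, now=None):
    """Prefix message with a timestamp; `now` defaults to the current time."""
    if now is None:
        now = datetime.now()
    return "[{0}] {1}".format(now.strftime("%Y-%m-%d %H:%M:%S"), message)


print(stamp("Checking for new job postings"))
```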

We are ready to start setting up our scraper!

Now we write the main() function that we will use with the BlockingScheduler to set the scraper to run at an interval of our choosing.

# scrape.py
def main():
    return

Inside main is where we set the page we are scraping, the response from the GET request to the URL, and the HTML of the page that comes in through the response.

I just went to indeed.com and searched for “web developer”, with Denver, CO as my filter.

url = 'https://www.indeed.com/jobs?q=web%20developer&l=Denver%2C%20CO&vjk=0c0f7c56b3d79b4c'
response = requests.get(url)
html = response.content

We need the HTML in a readable format (or at least readable for Python) in order to sort through it and pick out the data that we need.

So let's throw it into BeautifulSoup and search through the HTML!

soup = BeautifulSoup(html)
matches = soup.findAll(name='div', attrs={'class': 'title'})

Now our soup variable holds the beautiful HTML and we need to sort through it.

As you can see in the screenshot above, using Inspect Element in Chrome DevTools (CMD + option + i), I was able to determine that the class name of each job title posting was simply ‘title’.

We then search through the HTML with soup.findAll to find all the HTML elements with name='div' and a class attribute with the value title: attrs={'class': 'title'}.
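If you want to experiment with this kind of search without installing anything or hitting a live page, the sketch below does a comparable job with the standard library's html.parser instead of BeautifulSoup. The TitleCollector class and the sample HTML are made up for illustration, and the real page markup may well differ.

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Collect the text inside every <div> whose classes include 'title'."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._depth = 0    # greater than 0 while inside a matching div
        self._chunks = []  # text fragments of the div being collected

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self._depth:
            self._depth += 1  # a div nested inside a title div
        elif "title" in (dict(attrs).get("class") or "").split():
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                # Join the fragments and normalise whitespace.
                self.titles.append(" ".join("".join(self._chunks).split()))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


# Made-up markup standing in for a search results page.
sample_html = """
<div class="title"><a href="#">Junior Web Developer</a></div>
<div class="summary">Not a title</div>
<div class="title"><a href="#">Senior Engineer</a></div>
"""

collector = TitleCollector()
collector.feed(sample_html)
print(collector.titles)
```

BeautifulSoup hides all of this bookkeeping behind a single findAll call, which is why it is the nicer tool for the actual scraper.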

While working on this I was printing to the console a lot, and I added a print at the bottom of main so that, if you are following along, you can see what we have so far.

To see the output, navigate to the directory where you created the two script files and type python scrape.py. You should get a list of all the job titles of the postings on the first page of indeed.com.

Now we need to set it up so we get a text message when there is a relevant job posting for us!

Navigate to the sendMessage.scpt file in the same directory as the Python script and we will get started with some simple AppleScript.

# sendMessage.scpt
on run {targetPhoneNumber, targetMessageToSend}
end run

This on run handler takes in two arguments, a phone number and a message, and will be invoked from the Python script when we decide to send a text message (this is why we had import os at the top of the Python file; this will make more sense later).

Now we are going to tell the OS X application “Messages” that we want to send a text message to a phone number with a “service type” of iMessage.

The Messages application needs to know the service type through which you are sending your message, so we tell it that the service of the targetPhoneNumber (that was passed into the function) is iMessage.

We then set the targetMessageToSend, that was also passed into the function, to a variable.

Finally, we tell Messages to send the message to the phone number we provided.

With that, our sendMessage.scpt is done!

Let’s go back to our Python script and tell it when to send a text message.

Remember when we performed a .findAll on the beautiful HTML and got back a list of job titles? All we need to do now is search through that list to see if any part of a job title matches junior or jr, for junior developer.

We will perform this logic inside our main function.

We iterate the job titles with a for loop and see if the strings “Junior” or “Jr” exist.

If either or both do, then we use the os module to perform terminal operations from the Python script.

os.system("osascript sendMessage.scpt YOUR_NUMBER_HERE 'YOUR_MESSAGE_HERE'")

Replace the placeholders with your phone number and the message you want to send (I replaced the message with the original URL of the page, so I could open it directly from my text message).

These are the arguments that get passed into the on run handler in the AppleScript file (if you remember writing it).
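As an alternative to building one shell string for os.system, the same call can be sketched with the standard library's subprocess module, which takes the arguments as a list and so sidesteps quoting problems when the message contains spaces or quotes. This is a swapped-in technique, not what the original script does, and the function names below are hypothetical.

```python
import subprocess
import sys


def build_osascript_command(phone_number, message, script="sendMessage.scpt"):
    """Return the argument list for running the AppleScript with two arguments."""
    # Each list item is passed to osascript as-is, with no shell quoting needed.
    return ["osascript", script, phone_number, message]


def send_text(phone_number, message):
    """Run the AppleScript; only attempted on macOS, where osascript exists."""
    command = build_osascript_command(phone_number, message)
    if sys.platform != "darwin":
        print("Not on macOS, would have run:", command)
        return None
    return subprocess.run(command, check=False).returncode


# Only build and show the command here; call send_text(...) yourself to send.
command = build_osascript_command("YOUR_NUMBER_HERE", "YOUR_MESSAGE_HERE")
print(command)
```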

Then, at the end of each iteration, if we have a match we break out of the loop. This ensures we don’t get spammed with texts if there are multiple matches for Junior or Jr on the page.
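The matching-and-stop logic described above can be sketched on its own with plain strings. The function name and the sample titles are made up, and the comparison here is case-insensitive, which is an assumption on my part (the loop described above checks for “Junior” and “Jr” as written).

```python
def first_junior_match(titles, keywords=("junior", "jr")):
    """Return the first title containing any keyword, or None if none match."""
    for title in titles:
        lowered = title.lower()
        if any(keyword in lowered for keyword in keywords):
            return title  # stop at the first match, like the break in the loop
    return None


# Made-up job titles standing in for the scraped list.
sample_titles = [
    "Senior Web Developer",
    "Jr. Front End Developer",
    "Junior Web Developer",
]
match = first_junior_match(sample_titles)
if match is not None:
    print("Would send one text for:", match)
```

Because the function returns on the first hit, only one text would go out even though two of the sample titles qualify.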

The last thing to do is set an interval for it to run on.

This is where BlockingScheduler and our sch variable come in handy.

After the main function, we add these two lines of code:

sch.add_job(main, 'interval', seconds=3)
print('Press Ctrl+{0} to exit'.format('Break' if os.name == 'nt' else 'C'))

The first line adds a job for the scheduler to run the main() function on an 'interval' of seconds=3.

3 seconds is just for the purposes of testing. The interval also accepts other keyword arguments with integer values, such as hours=12 or minutes=10.
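To switch between a short test interval and a real one without editing the add_job line each time, one option is a thin wrapper like the sketch below. The helper name schedule_interval is hypothetical. It simply forwards the keyword arguments to the scheduler's add_job method with the 'interval' trigger, as the script already does.

```python
def schedule_interval(scheduler, job, **interval):
    """Register job on scheduler with an 'interval' trigger.

    interval holds keyword arguments such as seconds=3, minutes=10 or hours=12.
    """
    if not interval:
        raise ValueError("pass at least one of seconds=, minutes=, hours=")
    return scheduler.add_job(job, "interval", **interval)


# Usage with the sch and main names from scrape.py (left commented out here
# because starting a BlockingScheduler blocks until it is interrupted):
# schedule_interval(sch, main, hours=24)
# sch.start()
```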

We have a simple try/except statement that starts the scheduler!

try:
    sch.start()
except (KeyboardInterrupt, SystemExit):
    pass

OK… I know that Indeed already offers a service like this and probably one that also includes emails! Still, it was a lot of fun learning and building this little script.

My journey as a developer has only just begun, but, if this experience was anything to go by, I could not be more excited to continue it!

That’s it! You can now type python scrape.py into your terminal after changing the values and see your scraper send you texts.

If you want to change the webpage you are scraping or what you are looking for, the BeautifulSoup documentation is the place to start.

The logic is all there; you just might have to do more sorting than I did.

Once again, I had a lot of fun doing this.

I hope this post was informative and perhaps useful to someone out there.

Resources

The GitHub Project for this piece.
