54: Data Processing (Pandas) and NLP (spaCy)

2023-01-06

  • This week I’ve been using Pandas to help me resolve an issue with the discount manager.
  • You might remember that late last year I wrote a new script to send ad hoc discount code campaigns (this was after the discount manager app we built decided to stop working).
  • Well, this week, I was working on the ongoing campaign module.
  • This module will allow me to ingest a monthly employee data extract and then send a discount code to each new employee, discontinuing the codes of people who have left.
  • All pretty straightforward, but first, I needed to make sure that a campaign we were already running could be migrated from the old discount manager to this new approach.
  • I had to parse a list of everyone we sent codes to, and then I had to work out who had left and joined since the last time it ran.
  • To do this, I wrote some Python in a Jupyter notebook in VS Code.
  • Anytime I need to parse, analyse or manipulate structured files (xlsx, CSV, JSON), I go straight for pandas.
  • The investment in learning Python is insignificant compared to the time I save being able to analyse the data.
  • While doing this, I wondered how I might use Python to do the same with unstructured data.
  • Well, I found a library called spaCy which should help me do this through named entity recognition and entity linking.
  • I’m pretty new to this, and there’s a steep learning curve in understanding the lexicology behind written or spoken language, but as for the tools, spaCy is a joy to use.
  • With four lines of code, you can interrogate text and understand the different parts of speech (POS).
  • With a few more lines of code, you can create matcher rules to find specific entities in the text.
  • I wish I had found it sooner, as it would help us create an even better search experience on our intranet project.
  • So I ended the work week on a high with a bunch of new ideas.
  • I’ll document the projects as I go, so if anyone’s interested, they can play along too.
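
For anyone playing along, the joiner/leaver comparison described above could be sketched roughly like this. It is not the actual campaign script: the column names (`employee_id`, `email`), the sample rows, and the helper `split_joiners_and_leavers` are all made up for illustration. The technique is an outer merge using pandas' `indicator` flag.

```python
import pandas as pd

# Hypothetical data: everyone who was previously sent a code.
previous = pd.DataFrame({
    "employee_id": [101, 102, 103],
    "email": ["ana@example.com", "ben@example.com", "cy@example.com"],
})

# Hypothetical data: the latest monthly employee extract.
current = pd.DataFrame({
    "employee_id": [102, 103, 104],
    "email": ["ben@example.com", "cy@example.com", "dee@example.com"],
})


def split_joiners_and_leavers(previous, current, key="employee_id"):
    """Return (joiners, leavers) as lists of key values."""
    # An outer merge keeps every key from both sides; indicator=True adds a
    # "_merge" column saying which side each key came from.
    merged = previous[[key]].merge(
        current[[key]], on=key, how="outer", indicator=True
    )
    # "left_only": in the old list but not in the new extract (left).
    leavers = merged.loc[merged["_merge"] == "left_only", key].tolist()
    # "right_only": in the new extract but not in the old list (joined).
    joiners = merged.loc[merged["_merge"] == "right_only", key].tolist()
    return joiners, leavers


joiners, leavers = split_joiners_and_leavers(previous, current)
print("Send new codes to:", joiners)
print("Discontinue codes for:", leavers)
```

In a real notebook the two frames would more likely come from `pd.read_excel` or `pd.read_csv`, and the key would be whatever uniquely identifies an employee in the extract.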
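
The matcher rules mentioned above can be tried out with a sketch like the one below. The rule name `DISCOUNT_CODE`, the pattern, and the sample sentence are invented for illustration. It uses a blank English pipeline, which only tokenises text, so nothing needs downloading beyond spaCy itself.

```python
import spacy
from spacy.matcher import Matcher

# A blank pipeline tokenises text but has no trained components.
nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# Hypothetical rule: the word "discount" followed by "code" or "codes",
# compared case-insensitively via the LOWER token attribute.
pattern = [
    {"LOWER": "discount"},
    {"LOWER": {"IN": ["code", "codes"]}},
]
matcher.add("DISCOUNT_CODE", [pattern])

doc = nlp(
    "Each new employee gets a discount code, "
    "and old Discount Codes are discontinued."
)

# Each match is a (match_id, start, end) tuple of token offsets.
matches = sorted(matcher(doc), key=lambda m: m[1])
found = [doc[start:end].text for _, start, end in matches]
print(found)
```

Part-of-speech tags and named entities need a trained pipeline (for example `en_core_web_sm`) loaded with `spacy.load`, which this sketch deliberately avoids.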

  • Reading: Agency by William Gibson (56% Complete)
  • Wordle: 567 4/6
