57: Pandas – Filtering Records

2023-01-11

Today I had one of those ‘aha!’ moments with Pandas, which has made my life a lot simpler.

Here’s the story.

I have an Excel file which contains a list of discount codes. Some have been used, and some have not. I need to go through the list of codes and find out the following:

  • Have we assigned a code to anyone who has now left the business?

To find out who the ‘leavers’ were, I passed a list of employees and a list of used codes to a function, then iterated over the used codes, putting anyone who is not in the employee list into a leavers list.

def FindLeaversWithCodes(people, used_codes):
    # Find people we've sent a code to who are no longer in the HC report
    _leavers = []

    # Get the employee IDs of all the people we've sent codes to
    _people = used_codes['personNumber'].tolist()
    # Loop through each employee ID
    for person in _people:
        # If this employee ID is NOT in the current HC report, save it as a leaver
        if person not in people['Person Number'].values:
            _leavers.append(person)
    return _leavers

However, because I moved things from a DataFrame to a list, I then had to filter the original DataFrame by the list of employee numbers in _leavers to generate a new XLSX file in the original format. Incredibly inefficient.

try: 
    #Filter the used codes list to show the leavers
    df = used[used['personNumber'].isin(Leavers)]
    df.to_excel('../leavers.xlsx', index=False)
except Exception as error:
    print(error)

That first line of code df = used[used['personNumber'].isin(Leavers)] should have given me a clue as to what the FindLeaversWithCodes function should actually be doing.

Rather than converting to a list and using a for…in loop, I should have simply filtered the DataFrame to get the records I need.

Now the code is much simpler:

def LeaversWithCodes(people, used_codes) :
    #This person has been assigned a code
    #This person is not an employee
    return used_codes[~used_codes["personNumber"].isin(people["Person Number"].values.tolist())]

The ~ character before used_codes["personNumber"].isin(...) is a ‘not’ operator: it inverts the True/False result for each row. I’m doing the same as before, but this time filtering the original DataFrame to the rows where personNumber is not in the people DataFrame.
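To see the mask and its inversion in isolation, here is a minimal sketch. The sample rows and the "code" column are made up for illustration; only the personNumber and Person Number column names come from the functions above.

```python
import pandas as pd

# Hypothetical sample data; the real spreadsheets are not shown in this post.
people = pd.DataFrame({"Person Number": [101, 102, 103]})
used_codes = pd.DataFrame({
    "personNumber": [101, 104, 103, 105],
    "code": ["AAA", "BBB", "CCC", "DDD"],  # assumed column, for illustration only
})

# True where the code holder still appears in the current employee list.
is_current = used_codes["personNumber"].isin(people["Person Number"])

# ~ flips the boolean mask, keeping only rows for people who have left.
leavers = used_codes[~is_current]

print(leavers["personNumber"].tolist())
```

Because the result is still a DataFrame with the original columns, it can go straight to to_excel without the extra filtering step.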


Wordle: 570 4/6

56: Representing knowledge for AI

2023-01-10

  • I started the second lecture from the HarvardX course CS50’s Introduction to Artificial Intelligence with Python.
  • This time we’re learning about how AI can help with understanding knowledge.
  • I wish I had paid attention to the logic classes I had when I was a kid.
  • I’m seriously going to have to concentrate on this one. The content is not specifically hard. It’s just a lot to internalise.
  • It’s been around 25 years since I did any formal learning (obviously, I’ve learnt from books and training courses since), so getting used to lectures has been interesting.
  • I’ll write more when I’ve finished the video.

Wordle: 570 4/6

55: Digital Employee Experience.

2023-01-09

  • Today I started pulling together my thoughts on the ‘Employee Graph’ I want to integrate into our Employee Experience Platform.
  • The purpose of our Employee Experience platform is to apply a UX layer over the top of an organisation’s employee digital tools.
  • Alongside our more substantial products:
  • Intranet
  • Recognition
  • Performance Review
  • Rewards (Discount manager)
  • We also integrate into existing platforms such as Oracle HCM, and Microsoft 365.
  • However, the unique aspect of our solution is that we can aggregate the data from all these platforms and turn it into insights.
  • I want to make sure the insights are beneficial to the user (performance reviews, feedback), the line manager (team management), and the organisation (workforce planning)
  • So now I’m working through what this reporting dashboard could look like and the value it will return.

Wordle: 569 6/6

54: Lazy Sunday, HarvardX: introduction to AI.

2023-01-08

  • After yesterday’s fun, and not getting home until gone 01.00, we had a very quiet day today.
  • Our sole focus for the day was a new 1000-piece jigsaw puzzle, which, sadly, we didn’t finish by the time we stopped at 22.30.
  • Apart from that, I finished the first lecture of HarvardX’s CS50’s Introduction to Artificial Intelligence with Python.
  • The course is one of Harvard’s free lecture series released under the HarvardX brand and is, so far, a great introduction to the fundamentals of AI.
  • The first lecture is on search, where we learn about search problems.
  • We were introduced to several different search algorithms:
  • Uninformed search is a general-purpose search algorithm which operates in a brute force way as it does not have additional information about the state other than how to traverse the tree (so it is also called blind search).
  • Depth-first Search. The search algorithm will follow the nodes to the deepest possible depth before evaluating the next available path.
  • Breadth-first Search: The search algorithm will look at all child nodes before progressing to the next level of the tree.
  • Informed search is a search algorithm that knows its end state and uses that information to estimate how close it is to its goal.
  • We learn about greedy best-first search and A* search.
  • Then we were introduced to adversarial search problems (think Tic-tac-toe), where we were introduced to the following algorithms:
  • Minimax – we calculate all possible moves and assign a value to each outcome. Player 1 will try to minimise the score to win, and player 2 will try to maximise it.
  • Alpha-beta pruning is an optimisation of Minimax where we stop exploring a branch as soon as we can see it cannot beat the best score already found (highest or lowest, depending on whether you are maximising or minimising).
  • Depth-limited Minimax, where (in more complex games) we only calculate our next moves up to a limited depth (10 moves, say) and estimate how likely each resulting position is to lead to a win.
  • It is a well-put-together lecture, as you’d expect, and the content was new to me, which was awesome.
  • Next, we’ll learn how to extract knowledge from a body of text.
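As a note to self, the ideas from the lecture can be sketched in a few lines of Python. This is my own minimal sketch, not the course's code: the toy graph, the game tree and the function names are all made up for illustration, and it leaves out alpha-beta pruning and depth limits.

```python
from collections import deque

# Hypothetical toy graph: each node maps to its child nodes.
GRAPH = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"], "D": [], "E": [], "F": []}

def explore(start, goal, depth_first):
    """Return the order in which nodes are visited until the goal is found."""
    frontier = deque([start])
    visited = []
    while frontier:
        # Taking from the end (a stack) gives depth-first search;
        # taking from the front (a queue) gives breadth-first search.
        node = frontier.pop() if depth_first else frontier.popleft()
        if node in visited:
            continue
        visited.append(node)
        if node == goal:
            break
        frontier.extend(GRAPH[node])
    return visited

def minimax(node, maximising):
    """Score a game tree given as nested lists whose leaves are final scores."""
    if isinstance(node, int):
        return node
    # The players alternate, so the next level is scored from the other side.
    scores = [minimax(child, not maximising) for child in node]
    return max(scores) if maximising else min(scores)

# Hypothetical two-move game: the first player picks a branch, the second picks a leaf.
TREE = [[3, 5], [2, 9], [0, 7]]

print(explore("A", "F", depth_first=True))
print(explore("A", "F", depth_first=False))
print(minimax(TREE, maximising=True))
```

The only difference between the two uninformed searches is which end of the frontier the next node is taken from.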

  • Reading: Agency by William Gibson (100% Complete)
  • Wordle: 568 0/6

54: Stranger Things Experience

2023-01-07

  • For Christmas, Ellie bought us tickets to the Stranger Things Experience in London.
  • So, once again, due to the Train Strikes, I found myself driving into central London.
  • First stop, I dropped off some things for Ellie, then we got the tube to Brent Cross and walked to the show.
  • First, let me start by saying that, like most of the western world, I think Stranger Things is a masterpiece.
  • It’s produced one of the best scenes in television history. You know which one I mean, don’t you?
  • So, to say I was excited was an understatement.
  • The experience is split into two.
  • Firstly you participate in some immersive theatre.
  • Invited to Hawkins Lab to participate in a sleep study, you are welcomed to the lab by some cranky scientists, all slightly unhinged.
  • Then you are briefed by a member of the PR team, and everything goes downhill from there!
  • I won’t ruin the surprise, but it was an enjoyable experience, both Ellie and I had fun, and there were indeed some jaw-dropping moments.
  • Also, the action was led by the main series actors, which was pretty sweet.
  • Once the main show is over, you visit some of the locations from the series.
  • The setting reminded me of an exhibition setup, with a stall for each element.
  • It was pretty much a glorified gift shop with ice cream from Scoops Ahoy, pizza from Surfer Boy Pizza, as well as merchandise to buy.
  • But that didn’t detract from the overall experience.
  • There were plenty of photo ops too which was fun.
  • All in all Ellie and I had a great time. I’ll give it a solid 4 out of 5.

  • Reading: Agency by William Gibson (100% Complete)
  • Wordle: 568 0/6

54: Data Processing (Pandas) and NLP (spaCy)

2023-01-06

  • This week I’ve been using Pandas to help me resolve an issue with the discount manager.
  • You might remember that late last year, I successfully created a new script that would allow me to send ad hoc discount code campaigns (this was after the discount manager app we built decided to stop working).
  • Well, this week, I was working on the ongoing campaign module.
  • This module will allow me to ingest a monthly employee data extract and then send a discount code to each new employee, discontinuing the codes of people who have left.
  • All pretty straightforward, but first, I needed to make sure that a campaign we were already running could be migrated from the old discount manager to this new approach.
  • I had to parse a list of everyone we sent codes to, and then I had to work out who had left and joined since the last time it ran.
  • To do this, I created a Python app in a Jupyter notebook in VS Code.
  • Anytime I need to parse, analyse or manipulate structured files (xlsx, CSV, JSON), I go straight for pandas.
  • The investment in learning Python is insignificant compared to the time I save being able to analyse the data.
  • While doing this, I wondered how I might use Python to do the same with unstructured data.
  • Well, I found a library called spaCy which should help me do this through named entity recognition and entity linking.
  • I’m pretty new to this, and there’s a steep learning curve in understanding the lexicology behind written or spoken language, but as for the tools, spaCy is a joy to use.
  • With four lines of code, you can interrogate text and understand the different parts of speech (POS).
  • With a few more lines of code, you can create matcher rules to find specific entities in the text.
  • I wish I had found it sooner, as it would help us create an even better search experience on our intranet project.
  • So I ended the work week on a high with a bunch of new ideas.
  • I’ll document the projects as I go, so if anyone’s interested, they can play along too.
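For the joiner and leaver comparison described above, one possible approach is an outer merge with an indicator column. This is a minimal sketch under assumptions: the two extracts and their rows are made up, and the Person Number column name is borrowed from the HC report rather than taken from the actual campaign module.

```python
import pandas as pd

# Hypothetical monthly extracts; only the "Person Number" column name is from the HC report.
last_month = pd.DataFrame({"Person Number": [101, 102, 103]})
this_month = pd.DataFrame({"Person Number": [102, 103, 104]})

# An outer merge keeps every ID; the _merge column records which extract it came from.
compared = last_month.merge(
    this_month, on="Person Number", how="outer", indicator=True
)

# left_only: in last month's extract but not this month's, so they have left.
leavers = compared.loc[compared["_merge"] == "left_only", "Person Number"].tolist()
# right_only: new this month, so they need a code.
joiners = compared.loc[compared["_merge"] == "right_only", "Person Number"].tolist()

print(leavers, joiners)
```

One merge gives both lists at once, which suits a monthly run where leavers have codes discontinued and joiners are sent new ones.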

  • Reading: Agency by William Gibson (56% Complete)
  • Wordle: 567 4/6

53: Frustrations

2023-01-05

  • Today was a day of frustration.
  • Irritations at work mainly, which is a reminder to myself that UXC is a business, with employees, who have roles, with tasks which need to be fulfilled.
  • As much as I prefer to run things as flexibly as I can, I’m not doing myself or anyone else any favours if I protect people from the reality of providing services and the expectations of doing the work.
  • Nothing major, just grit in the mill that needs to be worked out.

  • Reading: Agency by William Gibson (53% Complete)
  • Wordle: 565 0/6

52: Simplicity

2023-01-04

  • Oh, for a bit of simplicity!
  • Today I had a catch-up with a good friend of mine, a long overdue meeting. We had some important D&D related things to discuss as he is joining our campaign but really it’s an opportunity for us to both provide some informal mentoring.
  • During our discussion today, we both ended up talking about our pursuit of simplicity in our working lives.
  • When you run your own business, it’s easy to make everything hideously complicated. Processes, meeting external expectations; the list is endless, but neither he nor I started our business to be the next Elon Musk.
  • We’re creative entrepreneurs for sure, and we both operate at the top of our fields, but for us, the businesses are a means to an end. For him it’s to give him time to write. For me, it’s to explore new product ideas and concepts (most of which will never be viable)
  • He likes to write. I want to create new digital product concepts.
  • A decade or so ago, these business vehicles would be called mom-and-pop businesses. A pejorative term meant to separate the real business people from those who are happy to make a good living and be free from employment.
  • These days, thanks to millennials staking their place in the business world, they have much sexier names, such as multi-hyphenated businesses or lifestyle businesses and they are considered a viable and desired way of working.
  • The gist is that you will do many things to generate the income you desire rather than putting all your energy into a single business entity (risk diversification).
  • The point is that work is a means to give you time to do the things you want to do, like write or tinker with new product ideas.
  • Thinking about it, it’s similar to The 4-Hour Workweek by Timothy Ferriss, except rather than creating meaningless muse projects to generate income so that you can be permanently retired, the goal is to make time to focus on the ‘work’ you love. (For some, it’s not feasible, or even desirable, to spend your life pursuing hobby-type activities.)
  • So as we sat, eating and chatting, I realised it’s time to make my life easier and start to simplify my business and life goals, to enable me to work on the things I love and not just the business I have.

  • Reading: Agency by William Gibson (53% Complete)
  • Wordle: 560 -/6

51: Visibility

2023-01-03

  • When Gary and I wrote Ready for Remote, we uncovered three principles for remote work.
  • Clarity, Visibility and Trust
  • I usually list them in reverse order putting trust first as trust is fundamental.
  • But trust is given, then reinforced or diminished through our actions.
  • Clarity and Visibility are two of the forces that act on trust positively or negatively.
  • Visibility is so important in a remote working setting as remote is a ‘what you see is all there is’ environment.
  • If someone is visible, there are many opportunities to reinforce trust with our colleagues. You can tell someone is on top of their game, and you can tell that someone needs support.
  • If someone isn’t visible, you cannot tell from where you are what someone is doing.
  • Trust starts to erode because, in the absence of information, we invite the wrong sorts of questions such as ‘What’s Matt doing?’, ‘Is that work item complete or not?’
  • Being proactively visible is an antidote to businesses using spyware to monitor their employees.
  • Visibility does not mean that we’re posting updates for update’s sake. It’s using the tools at hand to create progress indicators which telegraph to the rest of the team what you’re doing.
  • It’s moving cards on a kanban board and having work-related conversations in Slack. It’s publishing, and following up on, meeting notes.
  • These activities create artefacts which create visibility.

  • Reading: Agency by William Gibson (48% Complete)
  • Wordle: 560 -/6

50: Back to work tomorrow.

2023-01-02

  • Today is the last day off for Christmas/New Year before we head back to work for the new year.
  • I’m looking forward to getting back to work.
  • My goals for this month are:
  • 1. Finish the Retail App
  • 2. Complete the migration project
  • 3. Complete the Design a better world website
  • 4. Complete The Alice Sound website
  • Beyond that, my UX team will support Etch and we’ll finalise the Telstra Health project.
  • It’s a lot to do, but we’ll get it all done, no worries.

  • Today was pretty quiet. Tam and I did some tidying while Ellie worked on her essay.
  • This evening Tam and I embarked on one of our ‘year of crafts’ projects (while Ellie had a little nap)
  • We’re knitting a blanket from a kit I bought Tam for Christmas. (I thought I had purchased a crochet kit, but I discovered I was wrong when the set arrived).
  • It’s been a very long time since either of us has knitted, but after brushing up via YouTube, it sort of came back to us. 🫤
  • I secretly enjoy knitting as it reminds me of my Mum and my Nan.
  • My mum is a brilliant knitter, and she’s tried several times to teach me, but it’s been around 20 years since I last took up a pair of needles.
  • And it turns out that it is hard!
  • I’ve not yet rung my mum to help me understand the knitting pattern, but it won’t be long before I need her wisdom.

  • Reading: Agency by William Gibson (48% Complete)
  • Wordle: 562 5/6