Today I had one of those, aha! moments with Pandas which has made my life a lot simpler.
Here’s the story.
I have an excel file which contains a list of discount codes, some have been used, and some have not. I need to go through the list of codes and find out the following:
- Have we assigned a code to anyone who has now left the business?
To find out who the ‘leavers’ were, I passed a list of employees and a list of used codes to a function and then iterated over the list of codes, putting any employee who is not in the used code list into a levers list
def FindLeaversWithCodes(people, used_codes) :
#Find people we've sent a code to who are no longer in the HC Report
_leavers = 
#Get the Employee IDs of all the people we've sent codes to.
_people = used_codes['personNumber'].tolist()
#loop through each employee id
for person in _people:
#if this employee id is in teh current HC report then save as Current Employee
if person not in people['Person Number'].values: #if this employee id is NOT in the current HC report then save as Leaver
However, because I move things from a Dataframe to a list, I then had to filter the original Dataframe by the list of employee numbers,
_leavers in the list to generate a new XSLX file in the original format. Incredibly inefficient.
#Filter the used codes list to show the leavers
df = used[used['personNumber'].isin(Leavers)]
except Exception as error:
That first line of code
df = used[used['personNumber'].isin(Leavers)] should have given me a clue as to what the
FindLeaversWithCodes function should actually be doing.
Rather than convert to a list and use a for…in loop, I should have simply filtered the Dataframe to get the records I need.
Now the code is much simpler
def LeaversWithCodes(people, used_codes) :
#This person has been assigned a code
#This person is not an employee
return used_codes[~used_codes["personNumber"].isin(people["Person Number"].values.tolist())]
~ character before the
used_codes["personNumber"] is a ‘not’ symbol. I’m doing the same as before but this time filtering the original Dataframe where the personNumber is not in the People Dataframe.
Wordle: 570 4/6