We can use lambda functions to perform complex operations on columns.
Example 1
We can create a new column, last_name, from the name column of this table:
| | name | email |
|---|---|---|
| 0 | Jane Doe | jdoe@gmail.com |
| 1 | John Smith | jsmith@gmail.com |
| 2 | Lara Lane | laral@gmail.com |
get_last_name = lambda x: x.split()[-1]
df['last_name'] = df.name.apply(get_last_name)
| | name | email | last_name |
|---|---|---|---|
| 0 | Jane Doe | jdoe@gmail.com | Doe |
| 1 | John Smith | jsmith@gmail.com | Smith |
| 2 | Lara Lane | laral@gmail.com | Lane |
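To try Example 1 end to end, here is a minimal sketch that rebuilds the table by hand and applies the same lambda. The DataFrame construction and the `email` column label are assumptions, since the original only shows the rendered table.

```python
import pandas as pd

# Rebuild the example table by hand (the "email" label is an assumption).
df = pd.DataFrame({
    "name": ["Jane Doe", "John Smith", "Lara Lane"],
    "email": ["jdoe@gmail.com", "jsmith@gmail.com", "laral@gmail.com"],
})

# Split each full name on whitespace and keep the last piece.
get_last_name = lambda x: x.split()[-1]
df["last_name"] = df["name"].apply(get_last_name)

print(df)
```

The bracket form `df["name"]` is used here instead of `df.name`; both select the same column in this case. For simple string operations like this one, pandas string methods such as `df["name"].str.split().str[-1]` should give the same result without a lambda.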
Example 2
We calculate the 25th percentile of shoe price for each shoe_color:
import numpy as np
# 25th percentile of price within each shoe_color group
cheap_shoes = orders.groupby('shoe_color').price.apply(lambda x: np.percentile(x, 25)).reset_index()
print(cheap_shoes)
| | shoe_color | price |
|---|---|---|
| 0 | black | 130.0 |
| 1 | brown | 248.0 |
| 2 | navy | 200.0 |
| 3 | red | 157.0 |
| 4 | white | 188.0 |
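The orders table itself is not shown, so the sketch below uses a small hypothetical dataset to make the same groupby-and-lambda pattern runnable. The rows and prices are made up for illustration and do not reproduce the numbers in the table above.

```python
import numpy as np
import pandas as pd

# Hypothetical order data; not the dataset behind the table above.
orders = pd.DataFrame({
    "shoe_color": ["black", "black", "black", "red", "red", "red"],
    "price": [100.0, 200.0, 300.0, 50.0, 150.0, 250.0],
})

# For each color, the lambda receives that group's prices as a Series
# and returns a single number: the 25th percentile.
cheap_shoes = (
    orders.groupby("shoe_color")["price"]
    .apply(lambda x: np.percentile(x, 25))
    .reset_index()
)

print(cheap_shoes)
```

With default settings, the built-in `orders.groupby("shoe_color")["price"].quantile(0.25)` should give the same values. The lambda form is more useful when no built-in aggregation matches the calculation.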
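If we use apply without specifying a single column and add the argument axis=1, the input to our lambda function will be an entire row, not a column.
To access particular values of the row, we use the syntax row.column_name or row['column_name']. The dot form only works when the column name is a valid Python identifier, so a column such as 'Is taxed?' needs the bracket form: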
df['Price with Tax'] = df.apply(lambda row:
    row['Price'] * 1.075
    if row['Is taxed?'] == 'Yes'
    else row['Price'],
    axis=1
)
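To see the row-wise version run, here is a minimal sketch on a small hypothetical table. The `Item` column and all values are invented for illustration; only `Price`, `Is taxed?`, and `Price with Tax` come from the snippet above.

```python
import pandas as pd

# Hypothetical data; only the column names "Price" and "Is taxed?"
# are taken from the snippet above.
df = pd.DataFrame({
    "Item": ["Apple", "Milk", "Soap"],
    "Price": [1.00, 4.00, 2.00],
    "Is taxed?": ["No", "No", "Yes"],
})

# With axis=1 the lambda is called once per row, and `row` is a Series
# whose index is the column names.
df["Price with Tax"] = df.apply(
    lambda row: row["Price"] * 1.075 if row["Is taxed?"] == "Yes" else row["Price"],
    axis=1,
)

print(df)
```

Only the taxed row changes: its price is multiplied by 1.075, and the other rows keep their original price.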