Missing values are everywhere, and you don’t want them interfering with your work.
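# True wherever a value is missing
df.isna()
# True for each column that contains at least one missing value
df.isna().any()
# Number of missing values in each column
df.isna().sum()
# Bar plot of the missing-value counts per column
import matplotlib.pyplot as plt
df.isna().sum().plot(kind="bar")
plt.show()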
Example 1
# Import matplotlib.pyplot with alias plt
import matplotlib.pyplot as plt
# Check individual values for missing values
print(avocados_2016.isna())
# Check each column for missing values
print(avocados_2016.isna().any())
# Bar plot of missing values by variable
avocados_2016.isna().sum().plot(kind="bar")
# Show plot
plt.show()
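To see what these checks return without the avocados data, here is a minimal sketch on a made-up DataFrame. The name `toy` and its columns are assumptions for illustration only, not part of the original dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names are invented for this sketch
toy = pd.DataFrame({
    "avg_price": [1.10, np.nan, 1.35, 1.20],
    "small_sold": [100.0, 250.0, np.nan, np.nan],
    "region": ["West", "East", "West", "East"],
})

# Boolean DataFrame with the same shape: True wherever a value is missing
missing_mask = toy.isna()

# One boolean per column: does the column contain any missing value?
has_missing = toy.isna().any()

# Number of missing values in each column
missing_counts = toy.isna().sum()

print(missing_mask)
print(has_missing)
print(missing_counts)
```

Here `avg_price` has one missing value, `small_sold` has two, and `region` has none, so only the first two columns are flagged by `.any()`.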
One way to deal with missing values is to remove the rows that contain them from the dataset completely.
df.dropna()
Example 2
# Remove rows with missing values
avocados_complete = avocados_2016.dropna()
# Check if any columns contain missing values
print(avocados_complete.isna().any())
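Dropping rows can discard a lot of data, so it may help to check how many rows survive. The sketch below uses a made-up DataFrame (`toy` and its columns are assumptions). It also shows the `subset` parameter of `.dropna()`, which limits the check to chosen columns.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names are invented for this sketch
toy = pd.DataFrame({
    "avg_price": [1.10, np.nan, 1.35, 1.20],
    "small_sold": [100.0, 250.0, np.nan, np.nan],
    "region": ["West", "East", "West", "East"],
})

# Drop every row that has at least one missing value (the default behavior)
complete = toy.dropna()

# Drop only the rows where avg_price is missing; other columns may keep NaN
price_known = toy.dropna(subset=["avg_price"])

# .dropna() returns a new DataFrame, so toy itself is unchanged
print(len(toy), len(complete), len(price_known))
```

In this toy data only one of the four rows is fully complete, while three rows have a known `avg_price`. Which version is appropriate depends on which columns the later analysis needs.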
Another way of handling missing values is to replace them all with the same value. For numerical variables, one option is to replace missing values with 0.
df.fillna(0)
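A minimal sketch of filling with 0 on made-up data (`toy` and its columns are assumptions). It also compares the column mean before and after, since replacing missing values with 0 can shift summary statistics.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names are invented for this sketch
toy = pd.DataFrame({
    "small_sold": [100.0, np.nan, 300.0],
    "large_sold": [np.nan, 20.0, 40.0],
})

# Replace every missing value with 0; returns a new DataFrame
filled = toy.fillna(0)

# .mean() skips missing values by default, so the two means differ
mean_before = toy["small_sold"].mean()
mean_after = filled["small_sold"].mean()

print(filled)
print(mean_before, mean_after)
```

The filled column has no missing values left, but its mean is lower than before because the 0 now counts as a real observation.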