1. Missing Value Detection
df['series'].isna() → Boolean mask of missing values (alias: isnull())
df['series'].notna() → Boolean mask of non-missing values (alias: notnull())
2. Missing Value Removal
df['series'].dropna() → Remove missing values (returns new Series)
3. Missing Value Filling
df['series'].fillna(value) → Fill NA with specified value/method
df['series'].ffill() → Forward fill (alias: pad())
df['series'].bfill() → Backward fill (alias: backfill())
df['series'].interpolate() → Fill NA via interpolation
Key Notes:
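- isna/isnull and notna/notnull are identical (use whichever you prefer)
- ffill/pad propagate the last valid observation forward (pad is deprecated since pandas 2.0; prefer ffill)
- bfill/backfill propagate the next valid observation backward (backfill is deprecated since pandas 2.0; prefer bfill)
- interpolate offers multiple methods (linear, polynomial, etc.)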
Sample Dataset
import pandas as pd
import numpy as np
data = {
    'temperature': [22, np.nan, 28, 24, np.nan, 23, 26, 29],
    'ice_cream_sales': [110, 150, np.nan, 130, 170, 120, np.nan, 190],
    'time_series': [10, np.nan, np.nan, 18, 14, 20, 16, 22],  # Time-series with gaps
    'inventory': [45, 50, 55, np.nan, np.nan, 60, 65, 70]  # Inventory tracking
}
df = pd.DataFrame(data)
print(df)
   temperature  ice_cream_sales  time_series  inventory
0         22.0            110.0         10.0       45.0
1          NaN            150.0          NaN       50.0
2         28.0              NaN          NaN       55.0
3         24.0            130.0         18.0        NaN
4          NaN            170.0         14.0        NaN
5         23.0            120.0         20.0       60.0
6         26.0              NaN         16.0       65.0
7         29.0            190.0         22.0       70.0
1. Missing Value Detection
1.1 df['series'].isna() / df['series'].isnull()
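A minimal sketch of detection, assuming the `temperature` column from the sample dataset. The names `mask`, `n_missing`, and `missing_rows` are illustrative and not part of the original notes.

```python
import numpy as np
import pandas as pd

# Same values as the 'temperature' column in the sample dataset
df = pd.DataFrame({'temperature': [22, np.nan, 28, 24, np.nan, 23, 26, 29]})

mask = df['temperature'].isna()   # True where the value is missing
n_missing = int(mask.sum())       # True counts as 1, so the sum is the NaN count
missing_rows = df[mask]           # rows whose temperature is missing

print(mask.tolist())                 # [False, True, False, False, True, False, False, False]
print(n_missing)                     # 2
print(missing_rows.index.tolist())   # [1, 4]
```

`isnull()` returns the same mask, so the choice between the two is purely stylistic.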
1.2 df['series'].notna() / df['series'].notnull()
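A short sketch of using the non-missing mask as a row filter, assuming the `ice_cream_sales` column from the sample dataset. The names `valid_rows` and `n_valid` are illustrative.

```python
import numpy as np
import pandas as pd

# Same values as the 'ice_cream_sales' column in the sample dataset
df = pd.DataFrame({'ice_cream_sales': [110, 150, np.nan, 130, 170, 120, np.nan, 190]})

has_value = df['ice_cream_sales'].notna()   # True where a value is present
valid_rows = df[has_value]                  # keep only rows with recorded sales
n_valid = int(has_value.sum())

print(valid_rows.index.tolist())   # [0, 1, 3, 4, 5, 7]
print(n_valid)                     # 6
```

The mask is the element-wise negation of `isna()`, so `df[~df['ice_cream_sales'].isna()]` selects the same rows.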
2. Missing Value Removal
2.1 df['series'].dropna()
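A minimal sketch of removal, assuming the `temperature` column from the sample dataset. It illustrates two things that are easy to miss: the original column is left untouched, and the surviving values keep their original index labels. The names `cleaned` and `renumbered` are illustrative.

```python
import numpy as np
import pandas as pd

# Same values as the 'temperature' column in the sample dataset
df = pd.DataFrame({'temperature': [22, np.nan, 28, 24, np.nan, 23, 26, 29]})

cleaned = df['temperature'].dropna()          # new Series without the NaN entries
renumbered = cleaned.reset_index(drop=True)   # optional: renumber from 0

print(cleaned.index.tolist())            # [0, 2, 3, 5, 6, 7]
print(renumbered.index.tolist())         # [0, 1, 2, 3, 4, 5]
print(int(df['temperature'].isna().sum()))   # 2, the original still has its gaps
```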
3. Missing Value Filling
3.1 df['series'].fillna()
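A minimal sketch of filling with a constant and with the column mean, assuming the `temperature` column from the sample dataset. Recent pandas releases deprecate the `method=` argument of `fillna` in favor of `ffill()` and `bfill()`, so this sketch fills with values only. The names `filled_const` and `filled_mean` are illustrative.

```python
import numpy as np
import pandas as pd

# Same values as the 'temperature' column in the sample dataset
df = pd.DataFrame({'temperature': [22, np.nan, 28, 24, np.nan, 23, 26, 29]})

# Fill every gap with a fixed value
filled_const = df['temperature'].fillna(25)

# Fill every gap with the mean of the observed values (NaN is skipped by mean())
col_mean = df['temperature'].mean()
filled_mean = df['temperature'].fillna(col_mean)

print(filled_const.tolist())   # [22.0, 25.0, 28.0, 24.0, 25.0, 23.0, 26.0, 29.0]
print(round(col_mean, 2))      # 25.33
```

Both calls return a new Series, so assign the result back (`df['temperature'] = ...`) to keep it.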
3.2 df['series'].ffill() / pad()
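A minimal sketch of forward filling, assuming the `time_series` column from the sample dataset. The `limit` argument caps how many consecutive gaps are filled, which can help when a long outage should not inherit a stale value. The names `filled` and `filled_once` are illustrative.

```python
import numpy as np
import pandas as pd

# Same values as the 'time_series' column in the sample dataset
df = pd.DataFrame({'time_series': [10, np.nan, np.nan, 18, 14, 20, 16, 22]})

# Each gap takes the last valid value before it
filled = df['time_series'].ffill()

# Fill at most one consecutive gap; the second NaN in the run stays missing
filled_once = df['time_series'].ffill(limit=1)

print(filled.tolist())        # [10.0, 10.0, 10.0, 18.0, 14.0, 20.0, 16.0, 22.0]
print(filled_once.tolist())   # [10.0, 10.0, nan, 18.0, 14.0, 20.0, 16.0, 22.0]
```

A leading NaN has no earlier value to copy, so `ffill()` leaves it in place.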
3.3 df['series'].bfill() / backfill()
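A minimal sketch of backward filling, assuming the `inventory` column from the sample dataset. Each gap takes the next valid value that follows it. The name `filled` is illustrative.

```python
import numpy as np
import pandas as pd

# Same values as the 'inventory' column in the sample dataset
df = pd.DataFrame({'inventory': [45, 50, 55, np.nan, np.nan, 60, 65, 70]})

# Rows 3 and 4 are missing; both take the next recorded value (60)
filled = df['inventory'].bfill()

print(filled.tolist())   # [45.0, 50.0, 55.0, 60.0, 60.0, 60.0, 65.0, 70.0]
```

A trailing NaN has no later value to copy, so `bfill()` leaves it in place.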
3.4 df['series'].interpolate()
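A minimal sketch of interpolation, assuming the `temperature` column from the sample dataset for the default linear case. The second part uses a small hypothetical Series `readings` with unevenly spaced dates (not from the original notes) to show how `method='time'` weights by elapsed time instead of by position.

```python
import numpy as np
import pandas as pd

# Same values as the 'temperature' column in the sample dataset
df = pd.DataFrame({'temperature': [22, np.nan, 28, 24, np.nan, 23, 26, 29]})

# Default linear method: each gap becomes the midpoint of its neighbours here
linear = df['temperature'].interpolate()
print(linear.tolist())   # [22.0, 25.0, 28.0, 24.0, 23.5, 23.0, 26.0, 29.0]

# Hypothetical readings with an uneven gap: 1 day, then 2 days
readings = pd.Series(
    [10.0, np.nan, 40.0],
    index=pd.to_datetime(['2024-01-01', '2024-01-02', '2024-01-04']),
)

by_position = readings.interpolate()           # ignores the dates
by_time = readings.interpolate(method='time')  # one third of the way in time

print(by_position.iloc[1])   # 25.0
print(by_time.iloc[1])       # approximately 20.0
```

`method='time'` needs a DatetimeIndex, which is why the second example builds one explicitly.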
Summary Table
Method | Description | Example Use Case |
isna() / isnull() | Detect missing values | df['temp'].isna().sum() |
notna() / notnull() | Detect non-missing values | df[df['sales'].notna()] |
dropna() | Remove missing values | df['temp'].dropna() |
fillna() | Fill NA with value/method | df['temp'].fillna(25) |
ffill() / pad() | Forward fill | df['ts'].ffill() |
bfill() / backfill() | Backward fill | df['inventory'].bfill() |
interpolate() | Interpolate NA | df['temp'].interpolate() |
Key Practical Scenarios
- Time-Series Data:
  - Use ffill() for sensor readings where gaps should inherit the last valid value.
  - Use interpolate(method='time') for irregular timestamps (requires a DatetimeIndex).
- Business Data:
  - Replace missing sales data with the column mean:
python
df['ice_cream_sales'] = df['ice_cream_sales'].fillna(df['ice_cream_sales'].mean())
- Inventory Tracking:
  - Backward fill so each gap takes the next recorded stock count:
python
df['inventory'] = df['inventory'].bfill()
- Advanced Interpolation:
python
df['temperature'].interpolate(method='polynomial', order=2)  # Quadratic fit (requires SciPy)
Final Notes
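- Performance: ffill/bfill are generally faster than interpolate for large datasets.
- Inplace Modification: inplace=True modifies a standalone Series directly. On a column selected as df['column'] it acts on an intermediate object; recent pandas versions warn about this pattern, and under Copy-on-Write it leaves df unchanged, so assign the result back instead.
- Chaining Methods: Combine operations, placing any constant fill last so the earlier steps still see the gaps:
python
df['column'].interpolate().bfill().fillna(0)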
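A small sketch of why the position of a constant fill in a chain matters, using a hypothetical Series `s` that is not part of the sample dataset. Once `fillna(0)` has run there are no gaps left, so any `interpolate()` or `bfill()` after it has nothing to do.

```python
import numpy as np
import pandas as pd

# Hypothetical series with a leading gap and an interior gap
s = pd.Series([np.nan, 10.0, np.nan, 30.0])

# Constant fill first: later steps find no NaN and change nothing
constant_first = s.fillna(0).interpolate().bfill()

# Interpolate first, then backfill the leading gap, then a constant as a last resort
constant_last = s.interpolate().bfill().fillna(0)

print(constant_first.tolist())   # [0.0, 10.0, 0.0, 30.0]
print(constant_last.tolist())    # [10.0, 10.0, 20.0, 30.0]
```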