Created: Sep 1, 2025 10:14 PM
Status: In progress

1. Basic DataFrame Properties 🛠️
Understand your DataFrame’s structure at a glance.
df.index
→ Returns the row index labels of the DataFrame.
df.columns
→ Returns the column labels as an Index.
df.dtypes
→ Returns a Series with each column’s data type.
df.axes
→ Returns a list with the row axis and column axis.
df.shape
→ Returns a tuple (rows, columns) representing dimensions.
df.ndim
→ Returns the number of dimensions (always 2 for a DataFrame).
df.size
→ Returns the total number of elements (rows × columns).
df.memory_usage()
→ Shows memory usage for each column plus the index.
df.empty
→ Returns True if the DataFrame has no elements.
df.attrs
→ Dictionary for storing custom user metadata.
df.info()
→ Prints a concise summary: index dtype, columns, non-null counts.
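As a rough illustration of the properties above, the sketch below inspects a small made-up DataFrame. The column names and values are hypothetical and chosen only for the example.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "age": [31, 25, 40]})

shape = df.shape            # (rows, columns)
ndim = df.ndim              # 2 for any DataFrame
size = df.size              # rows * columns
columns = list(df.columns)  # column labels as a plain list
is_empty = df.empty         # True only when there are no elements

df.attrs["source"] = "example"  # free-form user metadata
df.info()                       # prints the summary to stdout
```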
2. Data Type & Object Management 🛠️
Check, infer, or change your DataFrame’s column data types.
df.astype()
→ Convert the dtype of one or more columns to a specified type.
df.convert_dtypes()
→ Convert columns to the best possible dtypes automatically.
df.infer_objects()
→ Infer better dtypes for object columns.
df.copy()
→ Create a deep (default) or shallow copy to avoid modifying the original data.
df.set_flags()
→ Return a new object with updated flags.
df.flags
→ Inspect the DataFrame’s flags (rarely used).
df.attrs
→ Same as above; user-defined metadata for extra context.
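A minimal sketch of a dtype conversion followed by a defensive copy, assuming a hypothetical frame whose numbers arrived as strings:

```python
import pandas as pd

# Hypothetical data: numbers stored as strings.
df = pd.DataFrame({"qty": ["1", "2", "3"], "price": ["1.5", "2.0", "3.25"]})

# astype returns a new DataFrame; the original is left unchanged.
typed = df.astype({"qty": "int64", "price": "float64"})

# A deep copy (the default) owns its data, so edits do not reach `typed`.
backup = typed.copy()
backup.loc[0, "qty"] = 99
```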
3. Label-Based & Positional Access 🛠️
Access cells, rows, or slices by label or position.
df.at[row_label, col_label]
→ Fast label-based scalar access (single cell).
df.iat[row_index, col_index]
→ Fast position-based scalar access.
df.loc[]
→ Select rows/columns by label; supports slices, conditions, or lists.
df.iloc[]
→ Select rows/columns by integer position.
Key: at/iat → single cell (fast). loc/iloc → slices or multiple rows/columns.
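The four accessors can be compared side by side. The sketch below uses a hypothetical frame with string row labels so that label-based and position-based access are easy to tell apart.

```python
import pandas as pd

# Hypothetical data with string row labels.
df = pd.DataFrame(
    {"age": [31, 25, 40], "city": ["Oslo", "Rome", "Lima"]},
    index=["ann", "bob", "cy"],
)

cell_by_label = df.at["bob", "age"]       # single cell by label
cell_by_pos = df.iat[2, 1]                # single cell by position (row 2, column 1)
older = df.loc[df["age"] > 30, ["city"]]  # boolean condition plus a column list
first_two = df.iloc[:2]                   # first two rows by position
label_slice = df.loc["ann":"bob"]         # label slices include both endpoints
```

Note that a label slice with loc includes its end label, while a positional slice with iloc excludes its end position.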
4. Iteration & Basic Loops 🛠️
Loop through columns or rows.
df.__iter__
→ Dunder method for iterating over column labels (not called directly; used by for loops).
df.items()
→ Iterate over (column_name, Series) pairs.
df.keys()
→ Alias for df.columns; returns column labels.
df.iterrows()
→ Iterate over (index, row Series) pairs; convenient but slower.
df.itertuples()
→ Iterate rows as namedtuples; faster than iterrows.
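A short sketch of the iteration styles listed above, on a hypothetical two-column frame:

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Iterating a DataFrame directly yields its column labels.
column_names = [name for name in df]

# items() yields (column_name, Series) pairs.
column_sums = {name: int(col.sum()) for name, col in df.items()}

# itertuples() yields one namedtuple per row; fields are the column names.
row_totals = [int(row.a + row.b) for row in df.itertuples(index=False)]
```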
5. Quick Inspection & Conversion 🛠️
Peek at data or convert to NumPy.
df.head(n)
→ Return first n rows (default 5).
df.tail(n)
→ Return last n rows.
df.values
→ Return DataFrame values as a NumPy array (legacy).
df.to_numpy()
→ Preferred way to convert DataFrame to NumPy array.
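For a quick check of these methods, the following sketch peeks at a hypothetical ten-row frame and converts it to a NumPy array:

```python
import pandas as pd

# Hypothetical data: two integer columns of ten rows each.
df = pd.DataFrame({"x": range(10), "y": range(10, 20)})

top = df.head(3)      # first three rows
bottom = df.tail(2)   # last two rows
arr = df.to_numpy()   # 2-D NumPy array, one row per DataFrame row
```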
6. Math, Binary Operations & Comparison 🛠️
Element-wise math, matrix dot products, and value-wise comparison.
Arithmetic
df.add() or df.__add__
→ Add element-wise; supports fill_value.
df.sub() or df.__sub__
→ Subtract element-wise.
df.mul() or df.__mul__
→ Multiply element-wise.
df.div() or df.truediv()
→ Divide element-wise (true division).
df.floordiv()
→ Floor division.
df.mod()
→ Modulo.
df.pow()
→ Exponentiate.
df.dot()
→ Matrix multiplication.
df.radd(), df.rsub(), df.rmul(), df.rdiv(), df.rtruediv(), df.rfloordiv(), df.rmod(), df.rpow()
→ Reverse operations with the operands swapped (e.g. df.rsub(other) computes other - df).
Comparison
df.lt()
→ Element-wise less than.
df.gt()
→ Greater than.
df.le()
→ Less than or equal.
df.ge()
→ Greater than or equal.
df.eq()
→ Equal to.
df.ne()
→ Not equal to.
Combine
df.combine(other, func)
→ Combine two DataFrames column-wise using a function.
df.combine_first(other)
→ Fill missing values with values from the same positions in other.
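The sketch below shows one method from each group (arithmetic, comparison, combine) on two hypothetical frames that each contain a missing value:

```python
import pandas as pd

# Hypothetical frames; None becomes NaN in these float columns.
a = pd.DataFrame({"x": [1.0, None], "y": [3.0, 4.0]})
b = pd.DataFrame({"x": [10.0, 20.0], "y": [30.0, None]})

# fill_value stands in for a value missing on one side only.
total = a.add(b, fill_value=0)

# Comparisons against NaN evaluate to False.
bigger = b.gt(a)

# Holes in `a` are filled from the same positions in `b`.
patched = a.combine_first(b)
```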
7. Function Application
Apply functions row-wise, column-wise, element-wise, or via a clean pipe.
df.apply(func, axis=0)
→ Apply function along an axis (0 = once per column, 1 = once per row).
df.applymap(func)
→ Apply function element-wise (deprecated since pandas 2.1 in favor of df.map).
df.agg() or df.aggregate()
→ Aggregate using one or more operations.
df.transform()
→ Transform rows/columns; shape is preserved.
df.pipe(func)
→ Pipe DataFrame through a custom function.
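To make the axis argument and the difference between agg, transform, and pipe concrete, here is a minimal sketch. The helper add_total is hypothetical and exists only for this example.

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# axis=0 (default): the function receives each column in turn.
col_range = df.apply(lambda col: col.max() - col.min())

# axis=1: the function receives each row in turn.
row_sum = df.apply(lambda row: row["a"] + row["b"], axis=1)

# agg reduces; transform keeps the original shape.
summary = df.agg(["min", "max"])
centered = df.transform(lambda col: col - col.mean())

def add_total(frame, name="total"):
    """Hypothetical helper: append a column holding each row's sum."""
    return frame.assign(**{name: frame.sum(axis=1)})

# pipe passes the DataFrame as the first argument of the function.
with_total = df.pipe(add_total)
```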
8. Aggregation & Descriptive Statistics
Describe or summarize your data.
df.sum()
→ Sum of values.
df.mean()
→ Mean value.
df.std()
→ Standard deviation.
df.var()
→ Variance.
df.count()
→ Count non-NA cells.
df.min()
→ Minimum value.
df.max()
→ Maximum value.
df.median()
→ Median value.
df.mode()
→ Mode(s).
df.prod() or df.product()
→ Product of values.
df.cumsum()
→ Cumulative sum.
df.cumprod()
→ Cumulative product.
df.cummax()
→ Cumulative max.
df.cummin()
→ Cumulative min.
df.rank()
→ Rank values.
df.quantile()
→ Return value at specified quantile.
df.pct_change()
→ Fractional change from the previous row.
df.kurt() or df.kurtosis()
→ Kurtosis.
df.skew()
→ Skewness.
df.sem()
→ Standard error of mean.
df.describe()
→ Generate descriptive statistics summary.
df.corr()
→ Correlation matrix.
df.cov()
→ Covariance matrix.
df.corrwith(other)
→ Correlation with another DataFrame.
df.nunique()
→ Count distinct elements.
df.value_counts()
→ Count how often each distinct row occurs.
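A small sketch of a few of these statistics, assuming a hypothetical frame in which column b is exactly twice column a (so their correlation is 1):

```python
import pandas as pd

# Hypothetical data: b = 2 * a.
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [2, 4, 6, 8]})

means = df.mean()        # one value per column
running = df.cumsum()    # same shape as df
corr = df.corr()         # pairwise correlation matrix
stats = df.describe()    # count, mean, std, min, quartiles, max
distinct = df.nunique()  # distinct values per column
```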
9. Filtering & Conditional
Filter rows conditionally.
df.isin(values)
→ Check if each element is in values.
df.where(cond)
→ Replace where condition is False.
df.mask(cond)
→ Replace where condition is True.
df.query(expr)
→ Query DataFrame with string expression.
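The filtering methods above can be sketched on a hypothetical table of cities and temperatures. where and mask are shown without a replacement value, in which case the replaced cells become NaN.

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({"city": ["Oslo", "Rome", "Lima"], "temp": [4, 18, 22]})

# isin builds a boolean mask that can be used to select rows.
in_europe = df[df["city"].isin(["Oslo", "Rome"])]

# query takes the condition as a string expression.
warm = df.query("temp > 10")

temps = df[["temp"]]
kept = temps.where(temps > 10)   # values failing the condition become NaN
hidden = temps.mask(temps > 10)  # values meeting the condition become NaN
```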
10. Reshaping & Pivoting
Switch between wide and long forms.
df.melt()
→ Unpivot columns to rows (wide → long).
df.pivot()
→ Reshape long to wide; requires unique index/column pairs.
df.pivot_table()
→ Spreadsheet-style pivot with aggregation.
df.stack()
→ Pivot columns into index (wide → long).
df.unstack()
→ Pivot index levels into columns (long → wide).
df.explode()
→ Transform list-like values to separate rows.
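A minimal round trip between wide and long form, plus explode, using hypothetical monthly sales data:

```python
import pandas as pd

# Hypothetical wide table: one column per month.
wide = pd.DataFrame({"id": [1, 2], "jan": [10, 20], "feb": [30, 40]})

# Wide to long: one row per (id, month) pair.
long_df = wide.melt(id_vars=["id"], var_name="month", value_name="sales")

# Long back to wide; each (id, month) pair must be unique.
back = long_df.pivot(index="id", columns="month", values="sales")

# explode turns each list element into its own row, repeating the index.
tags = pd.DataFrame({"id": [1, 2], "tag": [["a", "b"], ["c"]]})
exploded = tags.explode("tag")
```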
11. Missing Data & Cleaning
Detect, drop, or fill NaNs and duplicates.
df.isna() or df.isnull()
→ Detect missing values.
df.notna() or df.notnull()
→ Detect non-missing.
df.fillna(value)
→ Fill NaNs with a value.
df.dropna()
→ Drop rows/columns with NaNs.
df.ffill() or df.pad()
→ Forward-fill missing (pad is deprecated since pandas 2.0; prefer ffill).
df.bfill() or df.backfill()
→ Backward-fill (backfill is deprecated since pandas 2.0; prefer bfill).
df.duplicated()
→ Mark duplicate rows.
df.drop_duplicates()
→ Drop duplicate rows.
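The cleaning steps above might look like this on a hypothetical frame with one missing value and one repeated row:

```python
import pandas as pd

# Hypothetical data: one NaN in "a", and the last two rows are identical.
df = pd.DataFrame({"a": [1.0, None, 3.0, 3.0], "b": ["x", "y", "z", "z"]})

missing_per_column = df.isna().sum()  # count of NaNs in each column
filled = df.fillna({"a": 0.0})        # replace NaNs in "a" with 0.0
carried = df.ffill()                  # propagate the last valid value forward
complete = df.dropna()                # keep only rows without NaNs
unique_rows = df.drop_duplicates()    # keep the first of each repeated row
```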
12. Merge, Join & Combine
Combine multiple DataFrames.
df.merge()
→ SQL-style joins.
df.join()
→ Join columns using index or key.
df.update()
→ Update in place using non-NA values from another DataFrame.
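As a sketch of how these three differ, the example below joins two hypothetical tables (orders and customers) and then patches a price column in place:

```python
import pandas as pd

# Hypothetical tables; cust_id 99 has no matching customer.
orders = pd.DataFrame({"order_id": [1, 2, 3], "cust_id": [10, 20, 99]})
customers = pd.DataFrame({"cust_id": [10, 20], "name": ["Ann", "Bob"]})

# merge: SQL-style join on a shared column.
inner = orders.merge(customers, on="cust_id", how="inner")
left = orders.merge(customers, on="cust_id", how="left")

# join: match a column of the caller against the other frame's index.
joined = orders.join(customers.set_index("cust_id"), on="cust_id")

# update: modifies `prices` in place, taking only the non-NA values.
prices = pd.DataFrame({"price": [1.0, 2.0, 3.0]})
prices.update(pd.DataFrame({"price": [None, 2.5, None]}))
```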
13. Export & IO
Save DataFrame or convert to Python objects.
df.to_csv()
→ Save as CSV.
df.to_excel()
→ Save as Excel file.
df.to_json()
→ Save as JSON.
df.to_pickle()
→ Serialize as pickle.
df.to_sql()
→ Write to SQL database.
df.to_dict()
→ Convert to dictionary.
df.to_numpy()
→ Convert to NumPy array.
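A brief sketch of three of the exporters, writing to an in-memory buffer instead of a file so the example leaves nothing on disk. The data is hypothetical.

```python
import io

import pandas as pd

# Hypothetical data.
df = pd.DataFrame({"id": [1, 2], "name": ["Ann", "Bob"]})

# to_csv accepts a path or any file-like object.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

# orient="records" gives one dict (or JSON object) per row.
records = df.to_dict(orient="records")
json_text = df.to_json(orient="records")
```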
Aliases & Dunder Reminders
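agg = aggregate
kurt = kurtosis
prod = product
ffill = pad (pad is deprecated)
bfill = backfill (backfill is deprecated)
isna = isnull
notna = notnull
div = truediv
__add__ etc. = dunder methods; do not call them directly, use the operators (+, -) or add(), sub(), etc. instead.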