Created: Nov 13, 2025 1:36 AM
| Component | Primary Function |
| --- | --- |
| tidykit/ | Core package with 20+ cleaning functions |
| __init__.py | Package interface & convenience classes |
| core.py | All data cleaning implementations |
| setup.py | Package installation configuration |
| sample_data/ | Test datasets for demonstrations |
| tidykit_demo.ipynb | Interactive tutorial and examples |
| setup_tidykit.sh | One-command automated setup |
| requirements.txt | Python dependencies list |
| README.md | Main project documentation |
| SETUP_GUIDE.md | Installation instructions |
| QUICK_REFERENCE.md | Function quick reference |
| LICENSE | Legal permissions (MIT) |
Root Directory Structure
tidykit/ - Main Package Directory
This is the core Python package containing all data cleaning functionality.
Files in tidykit/:
1. __init__.py - Package Interface
2. core.py - Core Functionality
3. setup.py - Package Configuration
4. README.md - Package Documentation
5. LICENSE - Legal
6. .gitignore - Git Configuration
Subdirectories in tidykit/:
Set Up
Function Categories:
- Column Headers: clean_column_headers(), make_unique_columns()
- Numeric Data: clean_numeric_column()
- Duplicates: remove_duplicates()
- Missing Values: fill_missing()
- Outliers: remove_outliers_iqr(), remove_outliers_zscore(), detect_outliers_iqr(), cap_outliers()
- Text Processing: clean_text_column(), standardize_text_values()
- Date/Time: clean_date_column(), extract_date_features()
- Data Types: convert_data_types(), validate_data_ranges()
- Categorical: clean_categorical_column(), encode_categorical_variables()
- Quality: get_data_summary(), check_data_consistency()
- Convenience: quick_clean() - one-line cleaning pipeline
- Information: info() - displays comprehensive function reference
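To make the Outliers category more concrete, here is a minimal standard-library sketch of the IQR technique that names like remove_outliers_iqr() and cap_outliers() suggest. It is not tidykit's implementation: the helper names (iqr_bounds, drop_outliers_iqr, cap_outliers_iqr), the k=1.5 multiplier, and the "inclusive" quartile method are all assumptions chosen for illustration, and the real functions may take different arguments and work on DataFrames rather than lists.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR.

    Hypothetical helper: k=1.5 is the conventional default, assumed here.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers_iqr(values, k=1.5):
    """Keep only the values that fall inside the IQR fences."""
    lower, upper = iqr_bounds(values, k)
    return [v for v in values if lower <= v <= upper]


def cap_outliers_iqr(values, k=1.5):
    """Clamp values outside the fences to the nearest fence."""
    lower, upper = iqr_bounds(values, k)
    return [min(max(v, lower), upper) for v in values]


readings = [10, 12, 11, 13, 12, 11, 95]
print(iqr_bounds(readings))         # (8.75, 14.75)
print(drop_outliers_iqr(readings))  # [10, 12, 11, 13, 12, 11]
print(cap_outliers_iqr(readings))   # [10, 12, 11, 13, 12, 11, 14.75]
```

Dropping removes the row entirely, while capping keeps the row and pulls the extreme value back to the fence, which is presumably why the list names both kinds of function.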
Include Quick Check on data
clean_column_headers()
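The details of clean_column_headers() are not shown here, so the following is only a sketch of what header cleaning commonly involves: trimming whitespace, lowercasing, replacing punctuation with underscores, and making repeated names unique. The function names normalize_header and normalize_headers, the "column" fallback, and the numeric suffix scheme are hypothetical and may differ from what tidykit actually does.

```python
import re


def normalize_header(name):
    """Trim, lowercase, and collapse runs of non-alphanumerics into '_'."""
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip()).strip("_").lower()
    # Assumed fallback for headers that are empty after cleaning.
    return cleaned or "column"


def normalize_headers(names):
    """Normalize every header, then suffix repeats with _1, _2, ...

    Sketch only: it does not guard against a suffixed name colliding
    with a header that already ends in the same suffix.
    """
    seen = {}
    result = []
    for name in names:
        base = normalize_header(name)
        count = seen.get(base, 0)
        seen[base] = count + 1
        result.append(base if count == 0 else f"{base}_{count}")
    return result


raw = [" First Name ", "Total ($)", "first-name", ""]
print(normalize_headers(raw))
# ['first_name', 'total', 'first_name_1', 'column']
```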