Package: collinear 2.0.0

collinear: Automated Multicollinearity Management

Effortless multicollinearity management in data frames with both numeric and categorical variables for statistical and machine learning applications. The package simplifies multicollinearity analysis by combining four robust methods: 1) target encoding for categorical variables (Micci-Barreca, D. 2001 <doi:10.1145/507533.507538>); 2) automated feature prioritization to prevent key variable loss during filtering; 3) pairwise correlation for all variable combinations (numeric-numeric, numeric-categorical, categorical-categorical); and 4) fast computation of variance inflation factors.
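As a rough illustration of three of the ideas named above (target encoding, pairwise correlation, and variance inflation factors), here is a minimal base R sketch. It does not use the package and is not its implementation; the toy data, the column names (`a`, `b`, `c`, `group`, `y`), and the matrix-inverse route to VIF are assumptions chosen for brevity.

```r
# Toy data (hypothetical): one numeric response, numeric and categorical predictors.
set.seed(1)
n <- 200
df <- data.frame(
  a = rnorm(n),
  group = sample(c("x", "y", "z"), n, replace = TRUE)
)
df$b <- df$a + rnorm(n, sd = 0.3)  # strongly correlated with a
df$c <- rnorm(n)                   # independent of a and b
df$y <- df$a + ifelse(df$group == "x", 2, 0) + rnorm(n)

# Mean target encoding: replace each category with the mean of the
# response observed within that category.
df$group_encoded <- ave(df$y, df$group, FUN = mean)

# Pairwise Pearson correlation between numeric and encoded predictors.
predictors <- c("a", "b", "c", "group_encoded")
cor_matrix <- cor(df[, predictors])

# VIF of every predictor at once: the diagonal of the inverse correlation
# matrix equals 1 / (1 - R^2) of each predictor regressed on the others.
vif <- diag(solve(cor_matrix))
round(vif, 2)
```

In this sketch `a` and `b` should show clearly higher VIF values than `c`, because `b` was built from `a`; a filtering step would then drop one of the two, ideally the less preferred one.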

Authors: Blas M. Benito [aut, cre, cph]

collinear_2.0.0.tar.gz
collinear_2.0.0.zip (r-4.5), collinear_2.0.0.zip (r-4.4), collinear_2.0.0.zip (r-4.3)
collinear_2.0.0.tgz (r-4.4-any), collinear_2.0.0.tgz (r-4.3-any)
collinear_2.0.0.tar.gz (r-4.5-noble), collinear_2.0.0.tar.gz (r-4.4-noble)
collinear_2.0.0.tgz (r-4.4-emscripten), collinear_2.0.0.tgz (r-4.3-emscripten)
collinear.pdf | collinear.html
collinear/json (API)
NEWS

# Install 'collinear' in R:
install.packages('collinear', repos = c('https://blasbenito.r-universe.dev', 'https://cloud.r-project.org'))
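
Once installed, a first session might look like the sketch below. It relies only on names listed on this page (`vi`, `vi_predictors_numeric`, `vif_df`, `collinear`); the argument names `df` and `predictors` are assumed from the package's documented interface, and all other arguments are left at their defaults. The calls are guarded so the script does nothing when the package is unavailable.

```r
# Results stay NULL when the package is not installed.
vif_table <- NULL
selection <- NULL

if (requireNamespace("collinear", quietly = TRUE)) {
  library(collinear)

  # Example data frame and numeric predictor names listed under "Datasets".
  data(vi, vi_predictors_numeric, package = "collinear")

  # Variance inflation factors of the numeric predictors.
  vif_table <- vif_df(df = vi, predictors = vi_predictors_numeric)

  # Automated multicollinearity filtering with default settings
  # (argument names assumed; check ?collinear for the installed version).
  selection <- collinear(df = vi, predictors = vi_predictors_numeric)
}
```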

Bug tracker: https://github.com/blasbenito/collinear/issues

Datasets:
  • toy - One response and four predictors with varying levels of multicollinearity
  • vi - Example Data With Different Response and Predictor Types
  • vi_predictors - All Predictor Names in Example Data Frame vi
  • vi_predictors_categorical - All Categorical and Factor Predictor Names in Example Data Frame vi
  • vi_predictors_numeric - All Numeric Predictor Names in Example Data Frame vi

machine-learning, multicollinearity, statistics

5.32 score, 9 stars, 33 scripts, 286 downloads, 58 exports, 16 dependencies

Last updated 13 days ago from: 968df67bff. Checks: OK: 3, NOTE: 4. Indexed: yes.

Target | Result | Date
Doc / Vignettes | OK | Nov 10 2024
R-4.5-win | OK | Nov 10 2024
R-4.5-linux | OK | Nov 10 2024
R-4.4-win | NOTE | Nov 10 2024
R-4.4-mac | NOTE | Nov 10 2024
R-4.3-win | NOTE | Nov 10 2024
R-4.3-mac | NOTE | Nov 10 2024

Exports: add_white_noise, case_weights, collinear, cor_categorical_vs_categorical, cor_clusters, cor_cramer_v, cor_df, cor_matrix, cor_numeric_vs_categorical, cor_numeric_vs_numeric, cor_select, drop_geometry_column, encoded_predictor_name, f_auc_gam_binomial, f_auc_glm_binomial, f_auc_glm_binomial_poly2, f_auc_rf, f_auc_rpart, f_auto, f_auto_rules, f_functions, f_r2_gam_gaussian, f_r2_gam_poisson, f_r2_glm_gaussian, f_r2_glm_gaussian_poly2, f_r2_glm_poisson, f_r2_glm_poisson_poly2, f_r2_pearson, f_r2_rf, f_r2_rpart, f_r2_spearman, f_v, f_v_rf_categorical, identify_predictors, identify_predictors_categorical, identify_predictors_numeric, identify_predictors_type, identify_predictors_zero_variance, identify_response_type, model_formula, performance_score_auc, performance_score_r2, performance_score_v, preference_order, preference_order_collinear, target_encoding_lab, target_encoding_loo, target_encoding_mean, target_encoding_rank, validate_data_cor, validate_data_vif, validate_df, validate_encoding_arguments, validate_predictors, validate_preference_order, validate_response, vif_df, vif_select

Dependencies: codetools, digest, future, future.apply, globals, lattice, listenv, Matrix, mgcv, nlme, parallelly, progressr, ranger, Rcpp, RcppEigen, rpart

Readme and manuals

Help Manual

Help page | Topics
Add White Noise to Encoded Predictor | add_white_noise
Case Weights for Unbalanced Binomial or Categorical Responses | case_weights
Automated multicollinearity management | collinear
Hierarchical Clustering from a Pairwise Correlation Matrix | cor_clusters
Bias Corrected Cramer's V | cor_cramer_v
Pairwise Correlation Data Frame | cor_categorical_vs_categorical cor_df cor_numeric_vs_categorical cor_numeric_vs_numeric
Pairwise Correlation Matrix | cor_matrix
Automated Multicollinearity Filtering with Pairwise Correlations | cor_select
Removes geometry column in sf data frames | drop_geometry_column
Name of Target-Encoded Predictor | encoded_predictor_name
Association Between a Binomial Response and a Continuous Predictor | f_auc f_auc_gam_binomial f_auc_glm_binomial f_auc_glm_binomial_poly2 f_auc_rf f_auc_rpart
Select Function to Compute Preference Order | f_auto
Rules to Select Default f Argument to Compute Preference Order | f_auto_rules
Data Frame of Preference Functions | f_functions
Association Between a Continuous Response and a Continuous Predictor | f_r2 f_r2_gam_gaussian f_r2_glm_gaussian f_r2_glm_gaussian_poly2 f_r2_pearson f_r2_rf f_r2_rpart f_r2_spearman
Association Between a Count Response and a Continuous Predictor | f_r2_counts f_r2_gam_poisson f_r2_glm_poisson f_r2_glm_poisson_poly2
Association Between a Categorical Response and a Categorical Predictor | f_v
Association Between a Categorical Response and a Categorical or Numeric Predictor | f_v_rf_categorical
Identify Numeric and Categorical Predictors | identify_predictors
Identify Valid Categorical Predictors | identify_predictors_categorical
Identify Valid Numeric Predictors | identify_predictors_numeric
Identify Predictor Types | identify_predictors_type
Identify Zero and Near-Zero Variance Predictors | identify_predictors_zero_variance
Identify Response Type | identify_response_type
Generate Model Formulas | model_formula
Area Under the Curve of Binomial Observations vs Probabilistic Model Predictions | performance_score_auc
Pearson's R-squared of Observations vs Predictions | performance_score_r2
Cramer's V of Observations vs Predictions | performance_score_v
Quantitative Variable Prioritization for Multicollinearity Filtering | preference_order
Preference Order Argument in collinear() | preference_order_collinear
Target Encoding Lab: Transform Categorical Variables to Numeric | target_encoding_lab
Target Encoding Methods | target_encoding_loo target_encoding_mean target_encoding_rank
One response and four predictors with varying levels of multicollinearity | toy
Validate Data for Correlation Analysis | validate_data_cor
Validate Data for VIF Analysis | validate_data_vif
Validate Argument df | validate_df
Validates Arguments of 'target_encoding_lab()' | validate_encoding_arguments
Validate Argument predictors | validate_predictors
Validate Argument preference_order | validate_preference_order
Validate Argument response | validate_response
Example Data With Different Response and Predictor Types | vi
All Predictor Names in Example Data Frame vi | vi_predictors
All Categorical and Factor Predictor Names in Example Data Frame vi | vi_predictors_categorical
All Numeric Predictor Names in Example Data Frame vi | vi_predictors_numeric
Variance Inflation Factor | vif_df
Automated Multicollinearity Filtering with Variance Inflation Factors | vif_select