Sample size to evaluate the degree of multicollinearity in rye morphological traits

DOI:

https://doi.org/10.1590/1983-21252023v36n123rc

Keywords:

Correlation. Multivariate analysis. Sampling design. Secale cereale L.

Abstract

Investigating multicollinearity allows the parameters of multivariate analyses to be estimated with higher precision and to be interpreted biologically. Reliable estimates of the degree of multicollinearity require an appropriate sample size. Thus, the objectives of this study were to determine the sample size (number of plants) necessary to estimate the indicators of the degree of multicollinearity - condition number (CN), correlation matrix determinant (DET), and variance inflation factor (VIF) - in morphological traits of rye and to verify the variability of the sample size among the indicators. Five uniformity trials were conducted with the cultivar BRS Progresso and three with the cultivar Temprano. Eight morphological traits were evaluated in 780 plants in the eight trials. For each trial, 22 cases were selected among the 28 formed by combining the eight traits six at a time, totaling 176 cases. In each case, 197 sample sizes were planned (20, 25, 30, ..., 1,000 plants). For each sample size, 2,000 resamplings with replacement were performed, CN, DET, and VIF were determined in each resample, and the average of the 2,000 estimates was calculated. For each case and indicator (CN, DET, and VIF), the sample size was determined by three methods: the modified maximum curvature method and the linear and quadratic segmented models with plateau response. Sample size varies among the indicators: DET requires the largest sample size, followed by CN and VIF, with at least 180, 116, and 85 plants, respectively.
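The resampling scheme described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (the reference list indicates the study used R). It assumes that CN is the ratio of the largest to the smallest eigenvalue of the correlation matrix, that the VIFs are the diagonal elements of the inverse correlation matrix, and that the VIFs of a case are summarized by their maximum. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def planned_sample_sizes(start=20, stop=1000, step=5):
    """Sample sizes 20, 25, 30, ..., 1000 plants (197 values)."""
    return list(range(start, stop + 1, step))


def multicollinearity_indicators(data):
    """Return (CN, DET, max VIF) for an (n_plants x n_traits) array.

    Assumed definitions (not confirmed by the abstract):
      CN  = largest / smallest eigenvalue of the correlation matrix
      DET = determinant of the correlation matrix
      VIF = diagonal of the inverse correlation matrix, summarized by its maximum
    """
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)  # ascending order
    cn = eigenvalues[-1] / eigenvalues[0]
    det = np.linalg.det(corr)
    vif = np.diag(np.linalg.inv(corr))
    return cn, det, vif.max()


def bootstrap_indicator_means(data, sample_sizes, n_resamples=2000, seed=0):
    """For each sample size, average CN, DET and max VIF over the resamples.

    Plants (rows) are drawn with replacement from the observed data.
    Returns a dict mapping sample size -> array [mean CN, mean DET, mean VIF].
    """
    rng = np.random.default_rng(seed)
    n_plants = data.shape[0]
    means = {}
    for size in sample_sizes:
        estimates = np.empty((n_resamples, 3))
        for b in range(n_resamples):
            rows = rng.integers(0, n_plants, size=size)  # with replacement
            estimates[b] = multicollinearity_indicators(data[rows])
        means[size] = estimates.mean(axis=0)
    return means


if __name__ == "__main__":
    # Synthetic stand-in for one case (six correlated traits); not real rye data.
    rng = np.random.default_rng(42)
    common = rng.normal(size=(300, 1))
    traits = common + rng.normal(scale=0.8, size=(300, 6))
    result = bootstrap_indicator_means(traits, [20, 100, 300], n_resamples=200)
    for size, (cn, det, vif) in result.items():
        print(f"n={size:4d}  CN={cn:8.2f}  DET={det:.4f}  VIF={vif:.2f}")
```

The averaged indicators, taken as a function of sample size, are the curves to which the maximum curvature and segmented plateau methods would then be applied; that fitting step is not shown here.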

Published

01-12-2022

Section

Technical Note