How reliable is the multi-criteria evaluation system of the Welfare Quality® protocol for growing pigs?
This paper focuses on the reliability of the multi-criteria evaluation model included in the Welfare Quality® protocol for growing pigs, which aggregates the animal-based indicators first to criteria level, then to principle level and finally to an overall welfare score. The assessment was carried out in a practical application study on a sample of 24 farms in Germany. Altogether, 102 protocol assessments were carried out in repeated visits to these farms in order to evaluate the inter-observer and test-retest repeatability of the scores calculated by the multi-criteria evaluation system. Reliability was assessed by calculating several reliability and agreement parameters: Spearman Rank Correlation Coefficients (RS), Intraclass Correlation Coefficients (ICC), Smallest Detectable Changes (SDC) and Limits of Agreement (LoA). Inter-observer repeatability was insufficient for the criteria comfort around resting, absence of injuries, expression of social behaviours, expression of other behaviours, good human-animal relationship and positive emotional state, as well as for the principles good housing and appropriate behaviour. This is probably due mainly to the insufficient repeatability of the underlying indicators, which has been revealed in previous studies. Test-retest repeatability was predominantly insufficient. Overall, the present results highlight the importance of absolutely reliable indicators at the baseline level. Furthermore, the calculation procedure was shown to be partly incorrect and consequently needs correction. This study is therefore an important contribution to the further development of the Welfare Quality® protocols and of animal welfare assessment tools in general.
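As an illustration only, the four parameters named in the abstract (RS, ICC, SDC and LoA) can be sketched for paired scores, one pair per farm, from two observers or two visits. The abstract does not state which ICC form or SDC derivation was used, so this sketch assumes a one-way random-effects ICC for single measurements, a standard error of measurement (SEM) taken as the square root of the within-farm mean square, SDC = 1.96 × √2 × SEM, and Bland-Altman limits of agreement. The function names and the example scores are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, stdev


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; applied to ranks it gives Spearman's RS."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def reliability(first, second):
    """Return RS, ICC, SDC and LoA for paired scores (one pair per farm)."""
    first, second = list(first), list(second)
    n = len(first)

    rs = pearson(average_ranks(first), average_ranks(second))

    # One-way ANOVA with k = 2 measurements per farm (assumed ICC form).
    grand = mean(first + second)
    farm_means = [(a + b) / 2 for a, b in zip(first, second)]
    msb = 2 * sum((m - grand) ** 2 for m in farm_means) / (n - 1)
    msw = sum((a - m) ** 2 + (b - m) ** 2
              for a, b, m in zip(first, second, farm_means)) / n
    icc = (msb - msw) / (msb + msw)

    # Assumed definition: SEM = sqrt(MSW), SDC = 1.96 * sqrt(2) * SEM.
    sdc = 1.96 * math.sqrt(2) * math.sqrt(msw)

    # Bland-Altman limits of agreement on the paired differences.
    diffs = [a - b for a, b in zip(first, second)]
    d_mean, d_sd = mean(diffs), stdev(diffs)
    loa = (d_mean - 1.96 * d_sd, d_mean + 1.96 * d_sd)

    return {"RS": rs, "ICC": icc, "SDC": sdc, "LoA": loa}


if __name__ == "__main__":
    # Hypothetical criterion scores (0-100) for five farms, two observers.
    observer_a = [55.0, 62.0, 48.0, 71.0, 66.0]
    observer_b = [58.0, 60.0, 51.0, 69.0, 70.0]
    for name, value in reliability(observer_a, observer_b).items():
        print(name, value)
```

RS and ICC summarise how consistently farms are ordered and separated, whereas SDC and LoA are expressed in score units and indicate how large a difference between two assessments must be before it exceeds measurement error.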