
Multi-output regression with structurally incomplete target labels: A case study of modelling global vegetation cover

  • University of Helsinki

Research output: Contribution to journal › Article › peer-review

Abstract

Weakly supervised learning has recently emerged in the classification context, where true labels are often scarce or unreliable. However, this learning setting has not yet been extensively analyzed for regression problems, which are typical in macroecology. We further define a novel computational setting of structurally noisy and incomplete target labels, which arises, for example, when the multi-output regression task defines a distribution such that the outputs must sum to unity. We propose an algorithmic approach to reduce noise in the target labels and improve predictions. We evaluate this setting with a case study in global vegetation modelling, which involves building a model to predict the distribution of vegetation cover from climatic conditions based on global remote sensing data. We compare the performance of the proposed approach to several incomplete-target baselines. The results indicate that the error in the targets can be reduced by our proposed partial-imputation algorithm. We conclude that handling structural incompleteness in the target labels, instead of using only complete observations for training, helps to better capture global associations between vegetation and climate.
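
To make the setting concrete, the following is a minimal sketch of what completing a structurally incomplete target could look like when the outputs of each observation must sum to unity. It is not the authors' partial-imputation algorithm, which the abstract does not specify. The function name `impute_missing_shares`, the use of `NaN` to mark missing components, and the rule of spreading the unobserved mass in proportion to a model's current predictions are all assumptions made for illustration.

```python
import numpy as np

def impute_missing_shares(y_partial, y_pred):
    """Fill NaN entries so that each row sums to 1.

    Hypothetical illustration only, not the published algorithm.
    y_partial: (n, k) cover fractions, NaN where a component is unobserved.
    y_pred:    (n, k) non-negative model predictions for the same rows.
    """
    y_partial = np.asarray(y_partial, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    missing = np.isnan(y_partial)

    # Mass already explained by the observed components of each row.
    observed_sum = np.where(missing, 0.0, y_partial).sum(axis=1, keepdims=True)
    # Mass left for the unobserved components (never negative).
    residual = np.clip(1.0 - observed_sum, 0.0, None)

    # Assumption: split the residual in proportion to the predictions.
    weights = np.where(missing, y_pred, 0.0)
    weight_sum = weights.sum(axis=1, keepdims=True)
    safe_sum = np.where(weight_sum > 0, weight_sum, 1.0)

    # Fall back to an equal split if the predictions carry no mass.
    n_missing = np.maximum(missing.sum(axis=1, keepdims=True), 1)
    uniform = np.where(missing, 1.0 / n_missing, 0.0)
    shares = np.where(weight_sum > 0, weights / safe_sum, uniform)

    return np.where(missing, residual * shares, y_partial)

# Toy example: the first row has two unobserved components.
y_partial = np.array([[0.5, np.nan, np.nan],
                      [0.2, 0.3, 0.5]])
y_pred = np.array([[0.4, 0.3, 0.1],
                   [0.1, 0.1, 0.8]])
completed = impute_missing_shares(y_partial, y_pred)
```

In this toy example the observed component of the first row keeps its value, the remaining half of the mass is divided between the two missing components, and the fully observed second row is left unchanged. A sum-to-one constraint of this kind is what allows incomplete observations to be used for training rather than discarded.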

Original language: English
Article number: 101849
Journal: Ecological Informatics
Volume: 72
Publication status: Published - 1 Dec 2022

Keywords

  • Compositional data
  • Incomplete targets
  • Multi-output regression
  • Vegetation cover modelling
  • Weakly supervised learning
