A unified framework for tabular generative modeling: loss functions, benchmarks, and improved multi-objective Bayesian optimization approaches
Umeå University, Faculty of Medicine, Department of Diagnostics and Intervention. ORCID iD: 0000-0002-2391-1419
Umeå University, Faculty of Science and Technology, Department of Physics. ORCID iD: 0000-0001-5420-0591
Umeå University, Faculty of Medicine, Department of Diagnostics and Intervention. ORCID iD: 0000-0001-8851-2905
Umeå University, Faculty of Science and Technology, Department of Computing Science. ORCID iD: 0000-0001-7119-7646
2025 (English). In: Transactions on Machine Learning Research, E-ISSN 2835-8856, Vol. 12. Article in journal (Refereed). Published
Abstract [en]

Deep learning (DL) models require extensive data to achieve strong performance and generalization. Deep generative models (DGMs) offer a solution by synthesizing data. Yet current approaches for tabular data often fail to preserve feature correlations and distributions during training, struggle with multi-metric hyperparameter selection, and lack comprehensive evaluation protocols. We address these gaps with a unified framework that integrates training, hyperparameter tuning, and evaluation. First, we introduce a novel correlation- and distribution-aware loss function that regularizes DGMs, enhancing their ability to generate synthetic tabular data that faithfully represents the underlying data distributions. Theoretical analysis establishes stability and consistency guarantees. To enable principled hyperparameter search via Bayesian optimization (BO), we also propose a new multi-objective aggregation strategy based on iterative objective refinement Bayesian optimization (IORBO), along with a comprehensive statistical testing framework. We validate the proposed approach using a benchmarking framework with twenty real-world datasets and ten established tabular DGM baselines. The correlation-aware loss function significantly improves the synthetic data fidelity and downstream machine learning (ML) performance, while IORBO consistently outperforms standard Bayesian optimization (SBO) in hyperparameter selection. The unified framework advances tabular generative modeling beyond isolated method improvements. Code is available at: https://github.com/vuhoangminh/TabGen-Framework.
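
The abstract does not spell out the form of the correlation- and distribution-aware loss, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a penalty built from the gap between Pearson correlation matrices plus the gap between per-feature means and standard deviations. The function names (`correlation_penalty`, `moment_penalty`, `regularized_loss`) and the weights `alpha` and `beta` are hypothetical and chosen for illustration.

```python
import numpy as np


def correlation_penalty(real, synth):
    """Mean squared difference between the two Pearson correlation matrices."""
    corr_real = np.corrcoef(real, rowvar=False)
    corr_synth = np.corrcoef(synth, rowvar=False)
    return float(np.mean((corr_real - corr_synth) ** 2))


def moment_penalty(real, synth):
    """Squared gaps between per-feature means and standard deviations."""
    mean_gap = real.mean(axis=0) - synth.mean(axis=0)
    std_gap = real.std(axis=0) - synth.std(axis=0)
    return float(np.mean(mean_gap ** 2) + np.mean(std_gap ** 2))


def regularized_loss(base_loss, real, synth, alpha=1.0, beta=1.0):
    """Hypothetical total loss: base generative loss plus the two penalties."""
    return (base_loss
            + alpha * correlation_penalty(real, synth)
            + beta * moment_penalty(real, synth))


# Toy data: two strongly correlated numeric features.
rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.8], [0.8, 1.0]])
real = rng.multivariate_normal([0.0, 0.0], cov, size=2000)

# A "good" synthetic sample keeps the correlation; a "bad" one loses it.
good = rng.multivariate_normal([0.0, 0.0], cov, size=2000)
bad = rng.normal(size=(2000, 2))

print("good:", regularized_loss(0.0, real, good))
print("bad: ", regularized_loss(0.0, real, bad))
```

In this toy setting the sample that drops the feature correlation receives the larger penalty. Using such a term while training a DGM would require a differentiable implementation in the training framework and some handling of categorical columns, both of which are outside this sketch.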

Place, publisher, year, edition, pages
Transactions on Machine Learning Research, 2025. Vol. 12
National subject category
Artificial Intelligence
Identifiers
URN: urn:nbn:se:umu:diva-249190
OAI: oai:DiVA.org:umu-249190
DiVA, id: diva2:2033602
Available from: 2026-01-29. Created: 2026-01-29. Last updated: 2026-02-02. Bibliographically approved

Open Access in DiVA

No full text in DiVA


People

Vu, Minh Hoang; Edler, Daniel; Wibom, Carl; Melin, Beatrice S.; Rosvall, Martin

Search further in DiVA

By author/editor
Vu, Minh Hoang; Edler, Daniel; Wibom, Carl; Löfstedt, Tommy; Melin, Beatrice S.; Rosvall, Martin