Christophe Cerisara
@cerisara@mastodon.online · yesterday

Sparse batched finetuning of #LLM beats SOTA model editing methods at acquiring a small number of pieces of knowledge and generalizing from them; this settles the debate over whether SFT can learn new knowledge. Yet full FT fails to acquire a handful of such knowledge triplets, while sparse FT succeeds: there should be a continuum to explore between full FT on large datasets and sparse FT on a few facts only.

https://arxiv.org/pdf/2509.22072
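
For intuition, here is a minimal, hypothetical sketch of what sparse finetuning can look like in PyTorch: gradients are masked so that only a small, fixed subset of weights is updated while everything else stays frozen. The masking rule (top-k gradient magnitude on a probe batch), the sparsity level, and the toy model are assumptions for illustration, not the paper's actual method.

```python
# Sketch of sparse finetuning: update only a small, fixed subset of parameters.
# Hypothetical illustration; the selection rule and sizes are assumptions.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 16))
loss_fn = nn.MSELoss()

# Probe batch standing in for the few "facts" to acquire (random placeholders).
x, y = torch.randn(8, 16), torch.randn(8, 16)

# 1) Score parameters by gradient magnitude on the probe batch.
loss_fn(model(x), y).backward()
masks = {}
for name, p in model.named_parameters():
    k = max(1, int(0.01 * p.numel()))          # keep ~1% of each tensor
    flat = p.grad.abs().flatten()
    mask = torch.zeros_like(flat)
    mask[flat.topk(k).indices] = 1.0
    masks[name] = mask.view_as(p)
    p.grad = None

# 2) Finetune, zeroing gradients outside the sparse mask after each backward.
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
for _ in range(100):
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    for name, p in model.named_parameters():
        p.grad.mul_(masks[name])
    opt.step()
```

The contrast the post points at would then be between applying this kind of mask (sparse FT) and updating all parameters on the same few triplets (full FT).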
