Pull requests: huggingface/datatrove
Pull requests list
refactor(filters): use Counter(x) fast path in GopherRepetitionFilter
#493
opened Jun 28, 2026 by
mattfaltyn
3 of 4 tasks
Fix extractor sandbox hang when the extract function raises an error
#492
opened Jun 13, 2026 by
discobot
Fix incomplete stats.json when relaunching a LocalPipelineExecutor job
#491
opened Jun 13, 2026 by
discobot
Require explicit eos_token in MegatronDocumentTokenizer (and fail clearly on an EOS missing from the vocab)
#489
opened Jun 3, 2026 by
maxsloef-goodfire
chore: enable Dependabot weekly GitHub Actions bumps
dependabot
#487
opened May 26, 2026 by
hf-dependantbot-rollout
Bot
feat: Add full inference server response to InferenceResult
#410
opened Dec 19, 2025 by
hynky1999
Collaborator