That’s appreciated!
If there isn’t any discussion on reddit (as in this case), there’s no reason to link to reddit; you can just link to the project page. That said, if you think there’s important discussion there that helps with understanding the paper, use a teddit link instead, like:
https://teddit.net/r/MachineLearning/comments/14pq5mq/r_hardwiring_vit_patch_selectivity_into_cnns/
Please don’t post links to reddit.
Averaging model weights seems to help across textual domains as well; see "Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models" and "Scaling Expert Language Models with Unsupervised Domain Discovery". I wonder whether the two types of averaging (across hyperparameters and across domains) could be combined to produce even better models.
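
To make that concrete, here's a minimal sketch of what the combination might look like in PyTorch: first average checkpoints that differ only in hyperparameters within each domain (soup-style), then average the resulting per-domain models (BTM-style merge). The checkpoint file names and the 50/50 domain weights are hypothetical, and it assumes all checkpoints share an identical architecture.

    import torch

    def average_state_dicts(state_dicts, weights=None):
        # Element-wise weighted average of state dicts that share
        # the same keys and tensor shapes.
        if weights is None:
            weights = [1.0 / len(state_dicts)] * len(state_dicts)
        return {
            key: sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
            for key in state_dicts[0]
        }

    # Stage 1: within each domain, average runs that differ only in
    # hyperparameters (learning rate, seed, ...).
    domain_checkpoints = [
        ["legal_lr1e-4.pt", "legal_lr3e-4.pt"],      # hypothetical files
        ["medical_lr1e-4.pt", "medical_lr3e-4.pt"],  # hypothetical files
    ]
    domain_soups = [
        average_state_dicts([torch.load(p, map_location="cpu") for p in paths])
        for paths in domain_checkpoints
    ]

    # Stage 2: average the per-domain soups into a single model,
    # optionally weighting domains by relevance.
    combined = average_state_dicts(domain_soups, weights=[0.5, 0.5])
    torch.save(combined, "combined.pt")

Whether a plain average is the right merge at stage 2 is an open question; the BTM papers also consider ensembling and relevance-weighted combinations, which this sketch approximates with the weights argument.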