This network learned 32-bit numbers.

outerdim = 32

hiddendim = 3072

save_path = 'model_weights.pt'

numlayers = 2

batch_size = 100

It seems that when outerdim is doubled, hiddendim has to be multiplied by a considerably larger factor, at least when 2 layers are used. This scaling may not be practical for 256-bit numbers, but it is possible that adding more layers improves scalability.
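
To get a feel for what this scaling costs, here is a rough sketch that counts trainable parameters for a given setting. It assumes the network is a plain fully connected stack of numlayers linear layers whose input and output width is outerdim and whose intermediate width is hiddendim. The note does not spell out the architecture, so that layout and the helper name mlp_param_count are assumptions made for illustration only.

```python
def mlp_param_count(outerdim: int, hiddendim: int, numlayers: int) -> int:
    """Count weights and biases of a plain MLP (assumed architecture)."""
    if numlayers < 1:
        raise ValueError("numlayers must be at least 1")
    # Layer widths: input, (numlayers - 1) hidden activations, output.
    widths = [outerdim] + [hiddendim] * (numlayers - 1) + [outerdim]
    # Each linear layer has fan_in * fan_out weights plus fan_out biases.
    return sum(a * b + b for a, b in zip(widths, widths[1:]))


# Settings from the note above.
print(mlp_param_count(outerdim=32, hiddendim=3072, numlayers=2))  # 199712
```

Under this assumed layout, a 2-layer network grows linearly in both outerdim and hiddendim, so if hiddendim has to grow faster than outerdim, the parameter count grows faster than quadratically in the bit width. Calling the function with other candidate settings shows how quickly that adds up.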
